Preface


The present book is a new revised and updated version of “Number Theory
I. Introduction to Number Theory” by Yu.I.Manin and A.A.Panchishkin, which
appeared in 1989 in Moscow (VINITI Publishers) [Ma-PaM], and in English
translation [Ma-Pa] in 1995 (Springer Verlag).


The original book had been conceived as a part of a vast project, “En- 
cyclopaedia of Mathematical Sciences”. Accordingly, our task was to provide 
a series of introductory essays to various chapters of number theory, lead- 
ing the reader from illuminating examples of number theoretic objects and 
problems, through general notions and theories, developed gradually by many 
researchers, to some of the highlights of modern mathematics and great, some- 
times nebulous designs for future generations. 


In preparing this new edition, we tried to keep this initial vision intact. We 
present many precise definitions, but practically no complete proofs. We try 
to show the logic of number-theoretic thought and the wide context in which 
various constructions are made, but for detailed study of the relevant materials 
the reader will have to turn to original papers or to other monographs. Because 
of lack of competence and/or space, we had to - reluctantly - omit many 
fascinating developments. 


The new sections written for this edition include a sketch of Wiles’ proof
of Fermat’s Last Theorem, and relevant techniques coming from a synthesis
of various theories of Part II; the whole of Part III, dedicated to arithmetical
cohomology and noncommutative geometry; a report on point counts on
varieties with many rational points; the recent polynomial time algorithm for
primality testing; and some other subjects.


For a more detailed description of the content and suggestions for further
reading, see the Introduction.




We are very pleased to express our deep gratitude to Prof. M.Marcolli
for her essential help in preparing the last part of the new edition. We are
very grateful to Prof. H.Cohen for his assistance in updating the book,
especially Chapter 2. Many thanks to Prof. Yu.Tschinkel for very useful
suggestions, remarks, and updates; he kindly rewrote §5.2 for this edition. We
thank Dr.R.Hill and Dr.A.Gewirtz for editing some new sections of this
edition, and St.Kühnlein (Universität des Saarlandes) for sending us a detailed
list of remarks to the first edition.


Bonn, July 2004 Yu.I.Manin 
A.A.Panchishkin 
Introduction


Among the various branches of mathematics, number theory is characterized 
to a lesser degree by its primary subject (“integers”) than by a psychologi- 
cal attitude. Actually, number theory also deals with rational, algebraic, and 
transcendental numbers, with some very specific analytic functions (such as 
Dirichlet series and modular forms), and with some geometric objects (such 
as lattices and schemes over Z). The question whether a given article belongs 
to number theory is answered by its author’s system of values. If arithmetic 
is not there, the paper will hardly be considered as number-theoretical, even 
if it deals exclusively with integers and congruences. On the other hand, any 
mathematical tool, say, homotopy theory or dynamical systems may become 
an important source of number-theoretical inspiration. For this reason, com- 
binatorics and the theory of recursive functions are not usually associated 
with number theory, whereas modular functions are. 

In this book we interpret number theory broadly. There are compelling 
reasons to adopt this viewpoint. 

First of all, the integers constitute (together with geometric images) one of 
the primary subjects of mathematics in general. Because of this, the history 
of elementary number theory is as long as the history of all mathematics, and 
the history of modern mathematics began when “numbers” and “figures” were
united by the concept of coordinates (which in the opinion of I.R.Shafarevich 
also forms the basic idea of algebra, see [Sha87]). 

Moreover, integers constitute the basic universe of discrete symbols and 
therefore a universe of all logical constructions conceived as symbolic games. 
Of course, as an act of individual creativity, mathematics does not reduce 
to logic. Nevertheless, in the collective consciousness of our epoch there does 
exist an image of mathematics as a potentially complete, immense and pre- 
cise logical construction. While the unrealistic rigidity of this image is well 
understood, there is still a strong tendency to keep it alive. The last but not 
the least reason for this is the computer reality of our time, with its very 
strict demands on the logical structure of a particular kind of mathematical 
production: software. 
It was a discovery of our century, due to Hilbert and Gödel above all,
that the properties of integers are general properties of discrete systems and 
therefore properties of the world of mathematical reasoning. We understand 
now that this idea can be stated as a theorem that provability in an arbitrary 
finitistic formal system is equivalent to a statement about decidability of a 
system of Diophantine equations (cf. below). This paradoxical fact shows that 
number theory, being a small part of mathematical knowledge, potentially 
embraces all this knowledge. If Gauss’ famous motto on arithmetic *) needs 
justification, this theorem can be considered as such. 

We had no intention of presenting in this report the whole of number theo- 
ry. That would be impossible anyway. Therefore, we had to consider the usual 
choice and organization problems. Following some fairly traditional classifica- 
tion principles, we could have divided the bulk of this book into the following 
parts: 


1. Elementary number theory. 

2. Arithmetic of algebraic numbers.

3. Number-theoretical structure of the continuum (approximation theory, 
transcendental numbers, geometry of numbers Minkowski style, metric 
number theory etc.). 

4. Analytic number theory (circle method, exponential sums, Dirichlet series 
and explicit formulae, modular forms). 

5. Algebraic-geometric methods in the theory of Diophantine equations. 

6. Miscellany (“wastebasket”). 




We preferred, however, a different system, and decided to organize our subject 
into three large subheadings which shall be described below. Because of our 
incompetence and/or lack of space we then had to omit many important 
themes that were initially included in our plan. We shall nevertheless briefly
explain its concepts in order to present in a due perspective both this book 
and subsequent number-theoretical issues of this series. 


Part I. Problems and Tricks 


The choice of the material for this part was guided by the following principles. 

In number theory, like in no other branch of mathematics, a bright young 
person with a minimal mathematical education can sometimes work wonders 
using inventive tricks. There are a lot of unsolved elementary problems waiting 


*) “... Mathematik ist die Königin von Wissenschaften und Arithmetik die
Königin von Mathematik. ...in allen Relationen sie wird zum ersten Rank erlaubt.”
-Gauss. ..., cf. e.g. http://www.geocities.com/RainForest/Vines/2977/gauss/deutsch/quotes.html
(“Mathematics is the queen of sciences and arithmetic the queen of mathematics.
She often condescends to render service to astronomy and other natural
sciences, but in all relations she is entitled to the first
rank.” -Gauss. Sartorius von Waltershausen: Gauss zum Gedächtniss. (Leipzig,
1856), p.79.)
for fresh approaches. Of course, good taste is still necessary, and this comes
with long training. Also, nobody can tell a priori that, say, the ancient problem 
on the pairs of “friendly numbers” is a bad one, while the Fermat conjecture is 
a beauty but it cannot be approached without seriously developed technique. 
Elementary number theory consists of many problems, posed, solved and 
developed into theorems in the classical literature (Chapter 1), and also of 
many tricks which subsequently grew into large theories. The list of such 
tricks is still growing, as Apéry’s proof of the irrationality of $\zeta(3)$ shows. Any
professional mathematician can gain by knowing some of these stratagems. 
In order not to restrict ourselves to very well known results we emphasize 
algorithmic problems and such modern applications of number theory as pub- 
lic key cryptography (Chapter 2). In general, the number-theoretical methods 
of information processing, oriented towards computer science (e.g. the fast 
Fourier transform) have revitalized the classical elementary number theory. 


Part II. Ideas and Theories 


In this part we intended to explain the next stage of the number-theoreti- 
cal conceptions, in which special methods for solving special problems are 
systematized and axiomatized, and become the subject-matter of monographs 
and advanced courses. 

From this vantage point, the elementary number theory becomes an imag- 
inary collection of all theorems which can be deduced from the Peano axioms, 
of which the strongest tool is the induction axiom. It appears in such a role in 
meta-mathematical investigations and has for several decades been developed 
as a part of mathematical logic, namely the theory of recursive functions. 
Finally, since the remarkable proof of Matiyasevich’s theorem, a further ac- 
complished number-theoretical fragment has detached itself from this theory 
— the theory of Diophantine sets. 

A Diophantine set is any subset of natural numbers that can be defined 
as a projection of the solution set of a system of polynomial equations with 
integral coefficients. The Matiyasevich theorem says that any set generated 
by an algorithm (technically speaking, enumerable or listable) is actually Dio- 
phantine. In particular, to this class belongs the set of all numbers of provable 
statements of an arbitrary finitely generated formal system, say, of axioma- 
tized set-theoretical mathematics (Chapter 3). 

The next large chapter of modern arithmetic (Chapter 4) is connected with 
the extension of the domain of integers to the domain of algebraic integers. 
The latter is not finitely generated as a ring, and only its finitely generated 
subrings consisting of all integers of a finite extension of Q preserve essential 
similarity to classical arithmetic. Historically such extensions were motivated
by problems stated for Z (e.g. the Fermat conjecture, which leads to the
divisibility properties of cyclotomic integers). Gradually, however, an
essentially new object began to dominate the picture: the fundamental symmetry
group of number theory $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$. It was probably Gauss who first
understood this clearly. His earliest work on the construction of regular polygons by
ruler-and-compass methods already shows that this problem is governed not
by the visible symmetry of the figure but by the well-hidden Galois symmetry. 
His subsequent concentration on the quadratic reciprocity law (for which he 
suggested seven or eight proofs!) is striking evidence that he foresaw its place 
in modern class field theory. Unfortunately, in most modern texts devoted to
elementary number theory one cannot find any hint of explanation as to why 
quadratic reciprocity is anything more than just a curiosity. The point is that 
primes, the traditional subject matter of arithmetic, have another avatar as 
Frobenius elements in the Galois group. Acting as such upon algebraic num- 
bers, they encode in this disguise of symmetries much more number-theoretical 
information than in their more standard appearance as elements of Z. 

The next two chapters of this part of our report are devoted to algebraic- 
geometric methods, zeta-functions of schemes over Z, and modular forms. 
These subjects are closely interconnected and furnish the most important 
technical tools for the investigation of Diophantine equations. 

For a geometer, an algebraic variety is the set of all solutions of a system of 
polynomial equations defined, say, over the complex numbers. Such a variety 
has a series of invariants. One starts with topological invariants like dimension 
and (co)homology groups; one then takes into account the analytic invariants 
such as the cohomology of the powers of the canonical sheaf, moduli etc. The 
fundamental idea is that these invariants should define the qualitative features 
of the initial Diophantine problem, for example the possible existence of an 
infinity of solutions, the behaviour of the quantity of solutions of bounded size 
etc. (see Chapter 5). This is only a guiding principle, but its concrete realiza- 
tions belong to the most important achievements of twentieth century number 
theory, namely A.Weil’s programme and its realization by A.Grothendieck and 
P.Deligne, as well as G.Faltings’ proof of the Mordell conjecture. 

Zeta-functions (see Chapter 6) furnish an analytical technique for refining
qualitative statements to quantitative ones. The central place here belongs to
the so-called “explicit formulae”. These can be traced back to Riemann who in
his famous memoir discovered the third avatar of primes: zeroes of Riemann’s
zeta function. Generally, arithmetical functions and zeroes of various zetas are
related by a subtle duality. Proved or conjectured properties of the zeroes are
translated back to arithmetic by means of the explicit formulae. This duality
lies at the heart of modern number theory.

Modular forms have been known since the times of Euler and Jacobi. They 
have been used to obtain many beautiful and mysterious number-theoretical 
results. Simply by comparing the Fourier coefficients of a theta-series with its 
decomposition as a linear combination of Eisenstein series and cusp forms, 
one obtains a number of remarkable identities. The last decades made us 
aware that modular forms, via Mellin’s transform, also provide key informa- 
tion about the analytic properties of various zeta-functions. 

The material that deserved to be included in this central part of our
report is immense and we have had to pass in silence over many important
developments. We have also omitted some classical tools like the Hardy-Littlewood
circle method and the Vinogradov method of exponential sums. These were
described elsewhere (see [Vau81-97], [Kar75], ...). We have said only a few
words on Diophantine approximation and transcendental numbers, in
particular, the Gelfond-Baker and the Gelfond-Schneider methods (see [FelNes98],
[Bak86], [BDGP96], [Wald2000], [Ch-L01], [Bo90]...).

The Langlands program strives to understand the structure of the Galois 
group of all algebraic numbers and relates in a series of deep conjectures the 
representation theory of this group to zeta-functions and modular forms. 


Finally, at the end of Part II we try to present a comprehensive
exposition of Wiles’ marvelous proof of Fermat’s Last Theorem and the
Shimura-Taniyama-Weil conjecture using a synthesis of several highly developed
theories such as algebraic number theory, ring theory, algebraic geometry, the
theory of elliptic curves, representation theory, Iwasawa theory, and
deformation theory of Galois representations. Wiles used various sophisticated
techniques and ideas due to himself and a number of other mathematicians
(K.Ribet, G.Frey, Y.Hellegouarch, J.-M.Fontaine, B.Mazur, H.Hida,
J.-P.Serre, J.Tunnell, ...). This genuinely historic event concludes a whole epoch
in number theory, and opens at the same time a new period which could be
closely involved with implementing the general Langlands program. Indeed,
the Taniyama-Weil conjecture may be regarded as a special case of Langlands’
conjectural correspondence between arithmetical algebraic varieties (motives),
Galois representations and automorphic forms.


Part III. Analogies and Visions 


This part was conceived as an illustration of some basic intuitive ideas that
underlie modern number-theoretical thinking. One subject could have been
called Analogies between numbers and functions. We have included under this
heading an introduction to non-commutative geometry, Arakelov geometry,
Deninger’s program, Connes’ ideas on the trace formula in noncommutative
geometry and the zeros of the Riemann zeta function ... Note also the excellent
book [Huls94] which intends to give an overview of conjectures that dominate
arithmetic algebraic geometry. These conjectures include the Beilinson
conjectures, the Birch-Swinnerton-Dyer conjecture, the Shimura-Taniyama-Weil and
the Tate conjectures, .... Note also the works [Ta84], [Yos03], [Man02], [Man02a]
on promising developments on Stark’s conjectures.

In Arakelov theory a completion of an arithmetic surface is achieved by 
enlarging the group of divisors by formal linear combinations of the “closed 
fibers at infinity”. The dual graph of any such closed fiber can be described 
in terms of an infinite tangle of bounded geodesics in a hyperbolic
handlebody endowed with a Schottky uniformization. In the last Chapter 8, largely
based on a recent work of Caterina Consani and Matilde Marcolli, we consider
arithmetic surfaces over the ring of integers in a number field, with fibers of
genus $g \ge 2$. One can use Connes’ theory to relate the hyperbolic geometry to
Deninger’s Archimedean cohomology and the cohomology of the cone of the 
local monodromy N at arithmetic infinity. 
We use the standard system of cross-referencing in this book.


Suggestions for further reading 


A number of interesting talks on Number Theory can be found in the
proceedings of the International Congresses of Mathematicians in Beijing, 2002,
in Berlin, 1998 and in Zürich, 1994 (see [ICM02], [ICM98], [ICM94]).

A fairly complete impression of the development of number-theoretic
subjects can be obtained from Bourbaki talks: [Des90], [Bert92], [Fon92], [Oe92],
[Clo93], [Se94], [Bo95], [Se95], [Oe95], [Goo96], [Kon96], [Loe96], [Wald96],
[Abb97], [Fal98], [Mich98], [Colm2000], [Breu99], [Ma99], [Edx2000], [Ku2000],
[Car02], [Hen01], [Pey02], [Pey04], [Coa01], [Colm01], [Colm03], [Bi02].


For a more detailed exposition of the theory of algebraic numbers, of
Diophantine geometry and of the theory of transcendental numbers we refer the
reader to the volumes Number Theory I, II, and IV of Encyclopaedia of
Mathematical Sciences (see [Koch97], [La91], [FelNes98]) and to the excellent
monograph by J.Neukirch [Neuk99] (completed by [NSW2000]). We also recommend
the Lecture Notes [CR01] on arithmetic algebraic geometry from the Graduate
Summer School of the IAS/Park City Mathematics Institute.
Problems and Tricks


1 Elementary Number Theory


1.1 Problems About Primes. Divisibility and Primality 


1.1.1 Arithmetical Notation 


The usual decimal notation of natural numbers is a special case of notation
to the base m. An integer n is written to the base m if it is represented in the
form

$$n = d_{k-1}m^{k-1} + d_{k-2}m^{k-2} + \cdots + d_0$$


where $0 \le d_i \le m-1$. The coefficients $d_i$ are called m-ary digits (or simply
digits). Actually, this name is often applied not to the numbers $d_i$ but to the
special signs chosen to denote these numbers. If we do not want to specify
these signs we can write the m-ary expansion as above in the form
$n = (d_{k-1}d_{k-2}\ldots d_1d_0)_m$. The number of digits in such a notation is


$$k = [\log_m n] + 1 = [\log n/\log m] + 1$$


where [ ] denotes the integral part. Computers use the binary system; a binary
digit (0 or 1) is called a bit. The high school prescription for the addition of
a k-bit number and an l-bit number requires max(k, l) bit-operations (one
bit-operation here is a Boolean addition and a carry). Similarly,
multiplication requires $\le 2kl$ bit-operations (cf. [Knu81], [Kob94]). The number of
bit-operations needed to perform an arithmetical operation furnishes an
estimate of the computer working time (if it uses an implementation of the
corresponding algorithm). For this reason, fast multiplication schemes were
invented, requiring only $O(k \log k \log\log k)$ bit-operations for the
multiplication of two $\le k$-bit numbers, instead of $O(k^2)$, cf. [Knu81]. One can also
obtain a lower bound: under certain natural restrictions, one can prove that
there exists no algorithm which needs fewer than $k \log k/(\log\log k)^2$
bit-operations for the multiplication of two general $\le k$-bit numbers.
Notice that in order to translate the binary expansion of a number n into
the m-ary expansion one needs $O(k^2)$ bit-operations where $k = \log_2 n$. In
fact, this takes O(k) divisions with remainder, each of which, in turn, requires
O(kl) bit-operations where $l = \log_2 m$.
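
The conversion by repeated division with remainder can be sketched as follows. This is only an illustration of the procedure described above; the function names `to_base` and `from_base` are hypothetical and do not come from the text.

```python
def to_base(n: int, m: int) -> list[int]:
    """Return the m-ary digits of n, most significant digit first.

    Each pass of the loop is one division with remainder by m,
    so the number of passes equals the number of m-ary digits of n.
    """
    if n < 0 or m < 2:
        raise ValueError("need n >= 0 and m >= 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, m)  # d is the next digit, 0 <= d <= m - 1
        digits.append(d)
    return digits[::-1]


def from_base(digits: list[int], m: int) -> int:
    """Inverse of to_base: evaluate d_{k-1} m^{k-1} + ... + d_0 (Horner's rule)."""
    n = 0
    for d in digits:
        n = n * m + d
    return n


if __name__ == "__main__":
    print(to_base(1729, 10))       # decimal digits of 1729
    print(len(to_base(1000, 10)))  # number of decimal digits, [log_10 1000] + 1
```

The length of the returned list is the digit count k = [log_m n] + 1 from the formula above, computed without floating-point logarithms.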

We have briefly discussed some classical examples of algorithms. These 
are explicitly and completely described procedures for symbolic manipulation 
(cf. [Mar54], [GJ79], [Man80], [Ma99]). In our examples, we started with the 
binary expansions of two integers and obtained the binary expansion of their 
sum or product, or their m-ary expansions. In general, an algorithm is called
polynomial if the number of bit-operations it performs on data of binary length
L is bounded above by a polynomial in L. The algorithms just mentioned are
all polynomial (cf. [Kob94], [Knu81], [Ma99], [Ries85]). 


1.1.2 Primes and composite numbers 


The following two assertions are basic facts of number theory: a) every natural
number n > 1 has a unique factorization $n = p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}$ where
$p_1 < p_2 < \cdots < p_k$ are primes, $a_i > 0$; b) the set of primes is infinite.

Any algorithm finding such a factorization also answers a simpler ques- 
tion: is a given integer prime or composite? Such primality tests are important 
in themselves. The well known Eratosthenes sieve is an ancient (3rd century 
B.C.) algorithm listing all primes < n. As a by-product, it furnishes the small- 
est prime dividing n and is therefore a primality test. As such, however, it is 
quite inefficient since it takes > n divisions, and this depends exponentially 
on the binary length of n. Euclid’s proof that the set of primes cannot be 
finite uses an ad absurdum argument: otherwise the product of all the primes 
augmented by one would have no prime factorization. A more modern proof 
was given by Euler: the product taken over all primes 


$$\prod_p \left(1 - \frac{1}{p}\right)^{-1} = \prod_p \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\right) \tag{1.1.1}$$


would be finite if their set were finite. However, the r.h.s. of (1.1.1) reduces to
the divergent harmonic series $\sum_{n=1}^{\infty} n^{-1}$ due to the uniqueness of factorization.
Fibonacci suggested a faster primality test (1202) by noting that the
smallest non-trivial divisor of n is $\le [\sqrt{n}]$ so that it suffices to try only such numbers
(cf. [Wag86], [APR83]).
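
As a small illustration of the two classical procedures just mentioned, here is a sketch of the Eratosthenes sieve and of trial division restricted to candidates up to the integral part of the square root. The function names are hypothetical, and the code makes no attempt at efficiency.

```python
from math import isqrt


def eratosthenes(n: int) -> list[int]:
    """List all primes <= n by crossing out multiples of each prime found."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            # Multiples below p*p were already crossed out by smaller primes.
            for q in range(p * p, n + 1, p):
                is_prime[q] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def smallest_divisor(n: int) -> int:
    """Smallest divisor > 1 of n >= 2, trying only candidates <= [sqrt(n)]."""
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    return n  # no divisor up to the square root, so n itself is prime


def is_prime_trial(n: int) -> bool:
    """Primality test by trial division."""
    return n >= 2 and smallest_divisor(n) == n


if __name__ == "__main__":
    print(eratosthenes(30))
    print(smallest_divisor(4294967297))
```

Both routines take a number of steps that grows exponentially in the binary length of n, which is the inefficiency the text points out.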
The next breakthrough in primality testing was connected with Fermat’s 
little theorem (discovered in the seventeenth century). 


Theorem 1.1 (Fermat’s Little Theorem). If n is prime then for any
integer a relatively prime to n


$$a^{n-1} \equiv 1 \pmod n. \tag{1.1.2}$$
(It means that n divides $a^{n-1} - 1$.) The condition (1.1.2) (with a fixed a) is
necessary but generally not sufficient for n to be prime. If it fails for n, we
can be sure that n is composite, without even knowing a single divisor of it.
We call n pseudoprime w.r.t. a if gcd(a, n) = 1 and (1.1.2) holds. Certain
composite numbers, e.g. $n = 561 = 3\cdot 11\cdot 17$, $1105 = 5\cdot 13\cdot 17$, $1729 = 7\cdot 13\cdot 19$,
are pseudoprime w.r.t. all a (relatively prime to n). Such numbers are called
Carmichael numbers (cf. [Kob94], [LeH.80]). Their set is infinite (this was proved
in [AGP94]). For example, a composite square-free n is a Carmichael number iff for any
prime p dividing n, p − 1 divides n − 1.
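
The divisibility criterion just stated is easy to check on small numbers. The sketch below is illustrative only: the helper names are hypothetical, and factoring by trial division is an assumption made for brevity. It tests the criterion and compares it with the pseudoprime condition (1.1.2).

```python
from math import gcd


def prime_factors(n: int) -> dict[int, int]:
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael(n: int) -> bool:
    """Composite, square-free, and p - 1 divides n - 1 for every prime p | n."""
    factors = prime_factors(n)
    if len(factors) < 2:  # excludes 0, 1, primes and prime powers
        return False
    if any(e > 1 for e in factors.values()):  # not square-free
        return False
    return all((n - 1) % (p - 1) == 0 for p in factors)


def is_pseudoprime(n: int, a: int) -> bool:
    """n is pseudoprime w.r.t. a: gcd(a, n) = 1 and a^(n-1) = 1 mod n."""
    return gcd(a, n) == 1 and pow(a, n - 1, n) == 1


if __name__ == "__main__":
    print([n for n in range(2, 2000) if is_carmichael(n)])
```

Running the search below 2000 recovers exactly the three examples quoted in the text.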

A remarkable property of (1.1.2) is that it admits a fast testing algorithm.
The point is that large powers $a^m \bmod n$ can be readily computed by repeated
squaring. More precisely, consider the binary representation of n − 1:


$$m = n - 1 = d_{k-1}2^{k-1} + d_{k-2}2^{k-2} + \cdots + d_0$$


with $d_{k-1} = 1$. Put $r_1 = a \bmod n$ and


$$r_{i+1} = \begin{cases} r_i^2 \bmod n & \text{if } d_{k-1-i} = 0 \\ a\,r_i^2 \bmod n & \text{if } d_{k-1-i} = 1 \end{cases}$$


Then $a^{n-1} \equiv r_k \bmod n$ because

$$a^{n-1} = (\cdots((a^{2+d_{k-2}})^2 a^{d_{k-3}})^2 \cdots)^2 a^{d_0}.$$


This algorithm is polynomial since it requires only $\le 3[\log_2 n]$
multiplications mod n to find $r_k$. It is an important ingredient of modern fast
primality tests using the Fermat theorem, its generalizations and (partial) converse
statements.
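
The recurrence for the residues r_i translates directly into a short routine. The following is a minimal sketch, with hypothetical function names, that follows the binary digits of the exponent from the most significant one downwards. Python's built-in three-argument `pow` does the same job and is used here only as a cross-check.

```python
def power_mod(a: int, m: int, n: int) -> int:
    """Compute a^m mod n (m >= 1, n >= 1) by repeated squaring.

    bits holds d_{k-1} ... d_0 with d_{k-1} = 1; r starts as r_1 = a mod n
    and after processing digit d_{k-1-i} it equals r_{i+1}.
    """
    if m < 1 or n < 1:
        raise ValueError("need m >= 1 and n >= 1")
    bits = bin(m)[2:]
    r = a % n
    for bit in bits[1:]:
        r = (r * r) % n        # squaring step, always performed
        if bit == "1":
            r = (a * r) % n    # extra multiplication when the digit is 1
    return r


def fermat_condition(a: int, n: int) -> bool:
    """Check condition (1.1.2): a^(n-1) = 1 mod n, for n >= 2."""
    return power_mod(a, n - 1, n) == 1


if __name__ == "__main__":
    print(power_mod(3, 200, 1000003) == pow(3, 200, 1000003))
    print(fermat_condition(2, 97), fermat_condition(2, 91))
```

Each binary digit costs one squaring and at most one further multiplication, in line with the multiplication count given above.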

This idea was used in a recent work of M. Agrawal, N. Kayal and N.
Saxena: a polynomial version of (1.1.2) led to a fast deterministic algorithm
for primality testing (of polynomial time $O((\log n)^{12+\varepsilon})$), cf. §2.2.4.

Fermat himself discovered his theorem in connection with his studies of
the numbers $F_n = 2^{2^n} + 1$. He believed them to be prime although he was able
to check this only for $n \le 4$. Later Euler discovered the prime factorization
$F_5 = 4294967297 = 641 \cdot 6700417$. No new prime Fermat numbers have been
found, and some mathematicians now conjecture that there are none.

The history of the search for large primes is also connected with the
Mersenne primes $M_p = 2^p - 1$ where p is again a prime. To test their primality one
can use the following Lucas criterion: $M_k$ ($k > 2$) is prime iff it divides $L_{k-1}$
where $L_n$ are defined by recurrence: $L_1 = 4$, $L_{n+1} = L_n^2 - 2$. This requires much
less time than testing the primality of a random number of the same order
of magnitude by a general method. Mersenne’s numbers also arise in various
other problems. Euclid discovered that if $2^p - 1$ is prime then $2^{p-1}(2^p - 1)$ is
perfect, i.e. is equal to the sum of its proper divisors (e.g. 6 = 1 + 2 + 3,
28 = 1 + 2 + 4 + 7 + 14, 496 = 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248), and
Euler proved that all even perfect numbers are of this type. It is not known
whether there are any odd perfect numbers, and this is one baffling example of
a seemingly reasonable question that has not led to any number-theoretical
insights, ideas or tricks worth mentioning here.
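
The Lucas criterion and Euclid's construction of perfect numbers can be tried out on small exponents. The sketch below assumes the exponent k is a prime greater than 2, as in the statement above; the function names are hypothetical, and the perfect-number check is a naive divisor sum meant only for small inputs.

```python
def lucas_test(k: int) -> bool:
    """Lucas criterion for M_k = 2^k - 1 with k > 2: M_k is prime iff M_k | L_{k-1}.

    L_1 = 4 and L_{n+1} = L_n^2 - 2; all values are reduced mod M_k.
    """
    if k <= 2:
        raise ValueError("the criterion is stated for k > 2")
    m = (1 << k) - 1
    value = 4 % m              # L_1
    for _ in range(k - 2):     # k - 2 steps lead from L_1 to L_{k-1}
        value = (value * value - 2) % m
    return value == 0


def is_perfect(n: int) -> bool:
    """n equals the sum of its proper divisors (naive check)."""
    return n > 1 and sum(d for d in range(1, n) if n % d == 0) == n


if __name__ == "__main__":
    odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
    print([p for p in odd_primes if lucas_test(p)])
    print([2 ** (p - 1) * (2 ** p - 1) for p in (2, 3, 5, 7)])
```

Only k − 2 squarings modulo M_k are needed, which is why this test is so much cheaper than a general-purpose method for numbers of the same size.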

Euler also knew the first eight prime Mersenne numbers (corresponding to
p = 2, 3, 5, 7, 13, 17, 19, 31). Recently computer-assisted primality tests have
furnished many new Mersenne primes, e.g. the 42nd known Mersenne prime,
discovered by Dr. Martin Nowak on February 26 (2005), is $2^{25,964,951} - 1$. It has
7,816,230 decimal digits. It is therefore not only the largest known Mersenne
prime, but also the largest known prime of any kind.*)

In Chapter 4 we consider some other modern methods of primality testing, 
in particular using elliptic curves (ECPP by Atkin—Morain). 


1.1.3 The Factorization Theorem and the Euclidean Algorithm 


For integers a, b we write a|b if a divides b, i.e., b = ad for some integer d. If
p is a prime and $p^\alpha$ is the highest power of p dividing n we write $p^\alpha \| n$ and
$\alpha = \mathrm{ord}_p n$. The factorization theorem can be easily deduced from its special
case: if a prime p divides ab then either p|a or p|b. Below we shall prove this
property using the Euclidean algorithm. Knowing the prime factorizations of
a and b one readily sees the existence and the explicit form of the greatest
common divisor gcd(a, b) and the least common multiple lcm(a, b). Namely,
put $m_p = \min(\mathrm{ord}_p(a), \mathrm{ord}_p(b))$, $g_p = \max(\mathrm{ord}_p(a), \mathrm{ord}_p(b))$. Then


$$\gcd(a, b) = \prod_p p^{m_p}, \qquad \mathrm{lcm}(a, b) = \prod_p p^{g_p}.$$


Again, the Euclidean algorithm allows us to prove the existence and to find
efficiently gcd(a, b) without even knowing the prime factorizations. Assume
that $a \ge b \ge 1$. The algorithm consists of calculating a sequence $x_0, x_1, x_2, \ldots$
where $x_0 = a$, $x_1 = b$ and $x_{i+1}$ is the residue of $x_{i-1}$ modulo $x_i$. One stops
when $x_k = 0$; then $x_{k-1} = \gcd(a, b)$. The number of required divisions is
bounded by $5\log_{10}\max(a, b)$ (Lamé’s theorem) (cf. [Knu81], [Wun85]). The
slowest instances for the Euclidean algorithm are the neighbouring Fibonacci
numbers $a = u_k$, $b = u_{k-1}$ where $u_0 = u_1 = 1$ and $u_{i+1} = u_i + u_{i-1}$. The
Euclidean algorithm also furnishes a representation


$$\gcd(a, b) = Aa + Bb \tag{1.1.3}$$


where A, B are integers. In order to find these, we shall consecutively define
pairs $(A_i, B_i)$ such that $x_i = A_i x_0 + B_i x_1$. Put $A_0 = B_1 = 1$, $A_1 = B_0 = 0$
and for $i \ge 1$


*) See http://www.mersenne.org and http://mathworld.wolfram.com/news/ for
updates and for the history, e.g. two previous values are 20996011 and 24036583.
Another recent record is the factorization of $M_{953}$ (Bahr F., Franke J. and
Kleinjung T. (2002)) (footnotes by Yu.Tschinkel and H.Cohen).
$$A_{i+1} = A_{i-1} - tA_i, \qquad B_{i+1} = B_{i-1} - tB_i$$


where t is given by $x_{i+1} = x_{i-1} - tx_i$. Since $\gcd(a, b) = x_{k-1}$ we can take
$A = A_{k-1}$, $B = B_{k-1}$. Finally, if p|ab for a prime p and p does not divide a then
gcd(a, p) = 1 so that Aa + Bp = 1 for some integers A, B. Hence Aab + Bpb = b
and p divides b.
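
The recurrences for the remainders and for the pairs of coefficients can be carried along together in a single loop. The following sketch (the function name is hypothetical) keeps only two consecutive terms of each sequence, which is all the recurrences require.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, A, B) with g = gcd(a, b) = A*a + B*b, for integers a, b >= 1.

    Invariant: x_prev = A_prev*a + B_prev*b and x_cur = A_cur*a + B_cur*b,
    starting from (A_0, B_0) = (1, 0) and (A_1, B_1) = (0, 1).
    """
    x_prev, x_cur = a, b
    A_prev, A_cur = 1, 0
    B_prev, B_cur = 0, 1
    while x_cur != 0:
        t, x_next = divmod(x_prev, x_cur)  # x_{i+1} = x_{i-1} - t*x_i
        A_prev, A_cur = A_cur, A_prev - t * A_cur
        B_prev, B_cur = B_cur, B_prev - t * B_cur
        x_prev, x_cur = x_cur, x_next
    # The loop stops at x_k = 0, so x_prev is x_{k-1} = gcd(a, b).
    return x_prev, A_prev, B_prev


if __name__ == "__main__":
    g, A, B = extended_euclid(240, 46)
    print(g, A, B, A * 240 + B * 46)
```

The returned coefficients are one valid choice of A and B in (1.1.3); they are not unique.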


1.1.4 Calculations with Residue Classes 


From the algebraic viewpoint, the set of integers Z is an associative
commutative ring with identity. The general divisibility theory in such rings uses the
fundamental notion of an ideal. An ideal I in a ring R is a subset which is an
additive subgroup with the property $RIR \subset I$.

An ideal of the form I = aR, $a \in R$ is called a principal ideal and is
denoted (a). The divisibility relation a|b is equivalent to the inclusion relation


$$(b) \subset (a) \quad \text{or} \quad b \in (a).$$


Any ideal I of Z must be principal since its elements are all divisible by the
smallest positive element of I. The maximal ideals (ordered by inclusion) are
precisely those which are generated by primes. The numbers having the same
remainder after division by a fixed N form N classes with pairwise empty
intersections

$$\bar{a} = a + N\mathbf{Z}, \quad 0 \le a \le N - 1,$$


the set of which also has a natural commutative associative ring structure
with identity
$$\mathbf{Z}/N\mathbf{Z} = \mathbf{Z}/(N) = \{\bar{0}, \bar{1}, \ldots, \overline{N-1}\}.$$


We traditionally write $a \equiv b \pmod N$ in place of $\bar{a} = \bar{b}$. Often one succeeds in
reducing some calculations in Z to calculations in an appropriate residue ring
Z/NZ. Besides finiteness, one useful property of this ring is the abundance of
invertible elements (while in Z there are only ±1). Actually, $\bar{a}$ is invertible iff
gcd(a, N) = 1 since the equation ax + Ny = 1 or, equivalently, $\bar{a}\cdot\bar{x} = \bar{1}$ can be
solved exactly in this case with integers x, y. The group of all invertible residue
classes is denoted $(\mathbf{Z}/N\mathbf{Z})^\times$. Its order $\varphi(N)$ is called Euler’s function. Euler
introduced it in connection with his generalization of the Fermat theorem:


$$a^{\varphi(N)} \equiv 1 \pmod N \tag{1.1.4}$$


for any a relatively prime to N, i.e. $\bar{a}^{\varphi(N)} = \bar{1}$ for any invertible element
$\bar{a}$ in Z/NZ. Euler’s conceptual proof shows in fact that in a finite Abelian
group of order f the order of an arbitrary element a divides f. In fact, the
multiplication by a is a permutation of the set of all elements. The product
of all elements is multiplied by $a^f$ under this map. Hence $a^f = 1$.

If $N = N_1N_2\cdots N_k$ and the $N_i$ are pairwise coprime we have a canonical
isomorphism
$$\mathbf{Z}/N\mathbf{Z} \cong \mathbf{Z}/N_1\mathbf{Z} \oplus \cdots \oplus \mathbf{Z}/N_k\mathbf{Z}. \tag{1.1.5}$$


The main part of this statement is called the Chinese Remainder Theorem: for
any $a_i \bmod N_i$, $i = 1, \ldots, k$ there exists an a such that $a \equiv a_i \bmod N_i$ for
all i. Again, such an a can be efficiently found using the Euclidean algorithm.
Put $M_i = N/N_i$. By assumption, $M_i$ and $N_i$ are relatively prime. Find $X_i$
with $X_iM_i \equiv 1 \bmod N_i$ and put


$$a = \sum_{i=1}^{k} a_iX_iM_i. \tag{1.1.6}$$


This is what we sought. From (1.1.5) we deduce the corresponding factoriza- 
tion of the multiplicative group 


(Z/NZ)* & (Z/N,Z)* ® «+» ® (Z/N,Z)*, (1.1.7) 


which shows in particular that y(N) = y(M1)...¢~(Nx). Since for a prime 
p we have y(p*) = p*!(p— 1) this allows us to find y(N) given the prime 
factorization of N. 
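The construction (1.1.6) translates directly into a few lines of code. This is a sketch under the assumption of Python 3.8 or later, where `pow(M, -1, N)` returns a modular inverse and `math.prod` exists; the function names `crt` and `phi_from_factorization` are ours.

```python
from math import prod

def crt(residues, moduli):
    """Solve a = a_i (mod N_i) for pairwise coprime N_i, following (1.1.6)."""
    N = prod(moduli)
    a = 0
    for a_i, N_i in zip(residues, moduli):
        M_i = N // N_i
        X_i = pow(M_i, -1, N_i)      # X_i * M_i = 1 (mod N_i)
        a += a_i * X_i * M_i
    return a % N

def phi_from_factorization(factors):
    """Euler's function from a dict {prime: exponent}, via phi(p^a) = p^(a-1) (p-1)."""
    return prod(p ** (e - 1) * (p - 1) for p, e in factors.items())
```

For instance, `crt([2, 3, 2], [3, 5, 7])` looks for the residue mod 105 that is 2 mod 3, 3 mod 5 and 2 mod 7.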

In the special case when $N = q$ is prime the ring $\mathbb{Z}/N\mathbb{Z}$ is a field: all its
non-zero elements are invertible. For a prime $p$, the notation $\mathbb{F}_p$ is used for
the field $\mathbb{Z}/p\mathbb{Z}$. The group $(\mathbb{Z}/N\mathbb{Z})^\times$ is cyclic: it coincides with the set of all
powers of an element $t = t_q$ (which is not unique). No efficient (e.g. polynomial)
algorithm for finding such a primitive root is known.

Recall Artin’s conjecture (on primitive roots): if $a \in \mathbb{Z}$ is not $-1$ or a perfect
square, then the number $N(x, a)$ of primes $p \le x$ such that $a$ is a primitive root
modulo $p$ is asymptotic to $C(a)\pi(x)$, where $C(a)$ is a constant that depends
only on $a$. In particular, there are infinitely many primes $p$ such that $a$ is a
primitive root modulo $p$. (Note that another famous conjecture of Artin, on
the holomorphy of $L$-series, will be discussed in §6.4.5.) Nobody has proved
this conjecture (on primitive roots) for even a single choice of $a$. There are
partial results, e.g., that there are infinitely many $p$ such that the order of $a$
is divisible by the largest prime factor of $p - 1$ (see, e.g., [Mor93], [HB86],
[BrGo02]). Neither can one efficiently compute the “discrete logarithm” (or
index) $x = \mathrm{ind}_t(a)$ defined for an invertible $a \bmod q$ by


\[ a \equiv t^x \bmod q, \qquad 0 \le x \le q - 1. \tag{1.1.8} \]


It is an important unanswered question whether such algorithms exist at all. 
However, there are fast ways for calculating $\mathrm{ind}_t$ if all prime divisors of $q - 1$
are small (cf. [Kob94]). First of all, one computes for all $p$ dividing $q - 1$ the
residue classes

\[ r_{p,j} = t^{j(q-1)/p}, \qquad j = 0, 1, \dots, p - 1, \]


lying in $(\mathbb{Z}/q\mathbb{Z})^\times$. This can be efficiently done by the iterated squaring method
(cf. 1.1.2). Let $\alpha_p = \mathrm{ord}_p(q - 1)$. It suffices to compute all the residues
$x \bmod p^{\alpha_p}$ and then to apply the Chinese Remainder Theorem (1.1.5). We fix $p$,
$\alpha = \alpha_p > 0$ and try to find $x \bmod p^{\alpha}$ in the form


\[ x \equiv x_0 + x_1 p + \cdots + x_{\alpha-1} p^{\alpha-1} \pmod{p^{\alpha}}, \qquad 0 \le x_i \le p - 1. \]


Since $a^{q-1} \equiv 1 \bmod q$ the residue $a^{(q-1)/p}$ is a $p$-th root of unity. From
$a \equiv t^x \bmod q$ it follows that


\[ a^{(q-1)/p} \equiv t^{x(q-1)/p} \equiv t^{x_0(q-1)/p} \equiv r_{p,x_0} \pmod q. \]


Therefore we can find the first digit $x_0$ by computing $a^{(q-1)/p}$ and comparing
it with the precomputed list of $r_{p,j}$. In order to find the next digit $x_1$ we first
replace $a$ by $a_1 = a/t^{x_0}$. Then we have


\[ \mathrm{ind}_t(a_1) = \mathrm{ind}_t(a) - x_0 \equiv x_1 p + \cdots + x_{\alpha-1} p^{\alpha-1} \pmod{p^{\alpha}}. \]


As $a_1$ is a $p$-th power we obtain from here $a_1^{(q-1)/p} \equiv 1 \bmod q$ and


\[ a_1^{(q-1)/p^2} \equiv t^{(x - x_0)(q-1)/p^2} \equiv t^{(x_1 + p x_2 + \cdots)(q-1)/p} \equiv t^{x_1 (q-1)/p} \equiv r_{p,x_1} \pmod q. \]


Therefore, one can discover $x_1$ by finding $a_1^{(q-1)/p^2}$ among the precomputed
list of $r_{p,j}$. One computes the other digits $x_i$ in the same way. The same list
can be used for various $a$’s, $q$ and $t$ being fixed. This is the
Silver-Pohlig-Hellman algorithm, cf. [Kob94]. It becomes impractical if $q - 1$ is divisible by
a large prime because then the table of $r_{p,j}$ becomes too long. The difficulty
of computing ind (and the general factorization problem) is utilized in cryp- 
tography (cf. Chapter 2, §2.1.6, [DH76], [Hel79], [ARS78] , [Od184] , [Od187], 
[Go02]). 
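The digit-by-digit procedure just described can be sketched as follows. This is an illustration rather than a production implementation: `factorize` uses naive trial division (adequate only because all prime divisors of q − 1 are assumed small), the function names are ours, and `pow` with a negative exponent requires Python 3.8 or later. The result is normalized to 0 ≤ x < q − 1.

```python
def factorize(n):
    """Trial-division factorization, returned as {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def discrete_log(a, t, q):
    """Return x with t**x = a (mod q), for a primitive root t modulo the prime q."""
    residues, moduli = [], []
    for p, alpha in factorize(q - 1).items():
        # Precomputed table r_{p,j} = t^{j(q-1)/p}, inverted so that r -> j.
        table = {pow(t, j * (q - 1) // p, q): j for j in range(p)}
        x_p, a_cur = 0, a
        for i in range(alpha):
            # a_cur^{(q-1)/p^(i+1)} equals r_{p, x_i}; read off the digit x_i.
            digit = table[pow(a_cur, (q - 1) // p ** (i + 1), q)]
            x_p += digit * p ** i
            # Divide a_cur by t^{x_i p^i} before extracting the next digit.
            a_cur = a_cur * pow(t, -digit * p ** i, q) % q
        residues.append(x_p)
        moduli.append(p ** alpha)
    # Glue the residues x mod p^alpha together with the Chinese Remainder Theorem.
    N, x = q - 1, 0
    for r, m in zip(residues, moduli):
        M = N // m
        x += r * pow(M, -1, m) * M
    return x % N
```

With q = 37 and t = 2 (a primitive root modulo 37), the table has only 2 + 3 entries, since 36 = 2² · 3².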


1.1.5 The Quadratic Reciprocity Law and Its Use 


Let $p$ and $q$ be odd primes. The main part of the quadratic reciprocity law, first
proved by Gauss, states that if $p \equiv q \equiv 3 \bmod 4$ then the solvability of one
of the congruences $x^2 \equiv p \bmod q$ and $x^2 \equiv q \bmod p$ implies the insolvability
of the other; in all other cases they are simultaneously solvable or unsolvable.
Gauss used this in order to compile large tables of primes.

To this end, he refined the primality test based on Fermat’s congruence
(1.1.2). Namely, define the Legendre symbol $\left(\frac{a}{n}\right)$ for a prime $n$ by


\[ \left(\frac{a}{n}\right) = \begin{cases} 0 & \text{if } a \equiv 0 \bmod n, \\ 1 & \text{if } a \equiv b^2 \bmod n, \\ -1 & \text{otherwise.} \end{cases} \]


Then from the cyclicity of $(\mathbb{Z}/n\mathbb{Z})^\times$ it follows that


\[ a^{(n-1)/2} \equiv \left(\frac{a}{n}\right) \pmod n. \tag{1.1.9} \]


If $n$ is not prime we define the Jacobi symbol by multiplicativity: for an odd
positive $n = p_1 p_2 \cdots p_k$, where the $p_i$ are primes, not necessarily distinct, put


\[ \left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right). \tag{1.1.10} \]


Now formula (1.1.9), which holds for the Jacobi symbol when $n$ is prime,
can be used as a primality test. Actually, the Jacobi symbol can be extended
to all values of the “numerator” and “denominator” and computed without
knowing the prime factorization of $n$. This is done with the help of the extended
quadratic reciprocity law (for odd positive coprime $P$, $Q$)


\[ \left(\frac{P}{Q}\right) \left(\frac{Q}{P}\right) = (-1)^{\frac{P-1}{2} \cdot \frac{Q-1}{2}}, \tag{1.1.11} \]


\[ \left(\frac{2}{P}\right) = (-1)^{(P^2-1)/8}, \qquad \left(\frac{-1}{P}\right) = (-1)^{(P-1)/2}, \tag{1.1.12} \]


together with the multiplicativity property with respect to both “numerator”
and “denominator”. The computation follows the same pattern as the
Euclidean algorithm and requires $\ll \log \max(P, Q)$ divisions with remainder.
A natural number $n$ is called an Eulerian pseudoprime w.r.t. $a$ if $\gcd(a, n) = 1$
and (1.1.9) holds. Using the Chinese Remainder Theorem, one can prove that
if $n$ is pseudoprime w.r.t. all $a \in (\mathbb{Z}/n\mathbb{Z})^\times$ then $n$ is prime. Thus, there are
no Eulerian analogues of the Carmichael numbers. Moreover, it was argued in
[Wag86] that if $n$ is composite then there is an $a < 2 \log n \log\log n$ such that
$n$ is not an Eulerian pseudoprime w.r.t. $a$.
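The Euclid-like evaluation of the Jacobi symbol, and the resulting pseudoprimality check based on (1.1.9), can be sketched as below. The function names `jacobi` and `is_euler_pseudoprime` are ours; the loop uses only the reciprocity law, the formula for the symbol of 2, and multiplicativity.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:            # pull out factors of 2: (2/n) = (-1)^((n^2-1)/8)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # reciprocity: sign flips iff both are 3 mod 4
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_euler_pseudoprime(n, a):
    """True if gcd(a, n) = 1 and a^((n-1)/2) = (a/n) (mod n), for odd n > 1."""
    return gcd(a, n) == 1 and pow(a, (n - 1) // 2, n) == jacobi(a, n) % n
```

For a prime n the check succeeds for every admissible a, while for a composite n such as 15 a small witness (here a = 2) already violates the congruence.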

The congruence (1.1.9) is used in the modern fast primality tests which 
will be considered in Chapter 2 (cf. [ARS78], [Mil76], [LeH.80], [Vas88]). 

The primality tests work much faster than all known methods for factor- 
izing “random” large integers, see §2.3. 

To conclude this subsection we say a few words about a subject which 
has traditionally caught the attention of many unselfish amateurs of number 
theory: that of finding “a formula” for primes. Euler noticed that the polynomial
$x^2 + x + 41$ takes many prime values. However, it was long known that
the values of an arbitrary polynomial $f(x_1, \dots, x_n) \in \mathbb{Z}[x_1, \dots, x_n]$ at integer
points cannot all be prime, e.g. because if $p$, $q$ are two large primes, then
the congruence $f(x_1, \dots, x_n) \equiv 0 \bmod pq$ is always solvable. Nevertheless,
using methods from the theory of recursive functions, one can construct a 
polynomial (in fact, many) whose set of positive values taken at lattice points 
coincides with the set of all primes. The following specimen was suggested in 
[JSWW76]. It depends on 26 variables that can be conveniently denoted by 
the letters of the English alphabet: 
\begin{align*}
&F(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z) = \\
&\quad (k+2)\bigl\{1 - [wz + h + j - q]^2 - [(gk + 2g + k + 1)(h + j) + h - z]^2 \\
&\qquad - [2n + p + q + z - e]^2 - [16(k+1)^3(k+2)(n+1)^2 + 1 - f^2]^2 \\
&\qquad - [e^3(e+2)(a+1)^2 + 1 - o^2]^2 - [(a^2 - 1)y^2 + 1 - x^2]^2 \\
&\qquad - [16 r^2 y^4 (a^2 - 1) + 1 - u^2]^2 \\
&\qquad - [((a + u^2(u^2 - a))^2 - 1)(n + 4dy)^2 + 1 - (x + cu)^2]^2 \\
&\qquad - [n + l + v - y]^2 - [(a^2 - 1)l^2 + 1 - m^2]^2 - [ai + k + 1 - l - i]^2 \\
&\qquad - [p + l(a - n - 1) + b(2an + 2a - n^2 - 2n - 2) - m]^2 \\
&\qquad - [q + y(a - p - 1) + s(2ap + 2a - p^2 - 2p - 2) - x]^2 \\
&\qquad - [z + pl(a - p) + t(2ap - p^2 - 1) - pm]^2\bigr\}.
\end{align*}


We also mention an inductive description of the sequence of all primes that 
can be derived by combinatorial reasoning (cf. [Gan71]): 


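\[ p_{n+1} = [1 - \log_2 \alpha_n] \tag{1.1.13} \]


where


\[ \alpha_n = \frac{1}{2} + \sum_{r=1}^{n} \; \sum_{1 \le i_1 < \cdots < i_r \le n} \frac{(-1)^r}{2^{p_{i_1} \cdots p_{i_r}} - 1}. \]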
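As a check on the inductive description (1.1.13), one can evaluate it in exact rational arithmetic. The sketch below assumes Gandhi's formula in the form $\alpha_n = \frac12 + \sum_{r \ge 1} \sum_{i_1 < \cdots < i_r} (-1)^r / (2^{p_{i_1} \cdots p_{i_r}} - 1)$; this normalization is our reading of the formula, and the function name `next_prime_gandhi` is ours. The integer part of $1 - \log_2 \alpha_n$ is found by comparing $\alpha_n$ with powers of 2, so no floating-point logarithm is needed.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def next_prime_gandhi(primes):
    """Given the first n primes, return p_{n+1} = [1 - log2(alpha_n)]."""
    alpha = Fraction(1, 2)
    for r in range(1, len(primes) + 1):
        for subset in combinations(primes, r):
            alpha += Fraction((-1) ** r, 2 ** prod(subset) - 1)
    # [1 - log2(alpha)] is the largest integer p with alpha <= 2^(1-p).
    p = 1
    while alpha <= Fraction(1, 2 ** p):
        p += 1
    return p
```

The number of terms doubles with each new prime, so this is a curiosity rather than a practical way of generating primes.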


1.1.6 The Distribution of Primes 


A first glance at a table of primes leaves an impression of chaos. For several 
centuries, mathematicians compiled large tables of primes in an attempt to 
see some order in them. Pell’s table (1668) lists all primes not exceeding $10^5$.
D.H. Lehmer in [Leh56] published his well-known tables containing all primes
up to $10^7$. In [PSW80] one can find all Fermat pseudoprimes $n < 25 \cdot 10^9$
satisfying the congruence $2^{n-1} \equiv 1 \bmod n$.


Already the first tables allowed the experimental study of the statistical 
distribution of primes, which seemed to be more accessible at least asymptot- 
ically. Put 

\[ \pi(x) = \mathrm{Card}\{p \mid p \text{ prime}, \ p \le x\}. \]


The graph of this step function even up to x = 100 looks pretty regular. For 
x < 50000 where the jumps are hidden by the scale, the regularity is striking 
(cf. Fig. 1.1 and 1.2). 


Computing $x/\pi(x)$ we see that for large $x$ it becomes close to $\log x$. One
sees also from Table 1.1 that when we multiply $x$ by 10, then


\[ \frac{10x}{\pi(10x)} \approx \frac{x}{\pi(x)} + \log 10, \quad \text{and} \quad \log(10x) = \log(x) + \log 10 \approx \log(x) + 2.3. \]

Table 1.1. For large $x$ the ratio $x/\pi(x)$ becomes close to $\log x$:


             x           pi(x)     x/pi(x)
            10               4         2.5
           100              25         4.0
         1 000             168         6.0
        10 000           1 229         8.1
       100 000           9 592        10.4
     1 000 000          78 498        12.7
    10 000 000         664 579        15.0
   100 000 000       5 761 455        17.4
 1 000 000 000      50 847 534        19.7
10 000 000 000     455 052 512        22.0
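The first rows of Table 1.1 are easy to reproduce with the sieve of Eratosthenes. The following sketch (with our own function name `prime_count`) is adequate up to about $10^7$ on ordinary hardware; the larger entries of the table require more refined methods.

```python
from math import isqrt, log

def prime_count(limit):
    """pi(limit), computed with a sieve of Eratosthenes over a bytearray."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"                     # 0 and 1 are not prime
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # strike out p^2, p^2 + p, ... in a single slice assignment
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(sieve)

for k in range(1, 7):
    x = 10 ** k
    print(x, prime_count(x), round(x / prime_count(x), 1), round(log(x), 1))
```

Printing $\log x$ next to $x/\pi(x)$ makes the slow convergence visible: the two columns differ by roughly 1 throughout this range.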


Actually, the asymptotic law of the distribution of primes (or prime number
theorem),


\[ \pi(x) \sim \frac{x}{\log x} \tag{1.1.14} \]


(meaning that the quotient of the two sides tends to 1 as $x$ tends to infinity)
was conjectured by the fifteen-year-old Gauss on the basis of his studies of the
available tables of primes, and proved by analytical methods only in 1896 by
Hadamard and de la Vallée-Poussin (cf. [Pra57], [Kar75]). Before that, in 1850,
P.L. Chebyshev (cf. [Cheby55]) found a very ingenious elementary proof of the
inequality
\[ 0.89\, \frac{x}{\log x} < \pi(x) < 1.11\, \frac{x}{\log x}. \]
For this he used only the divisibility properties of the binomial coefficients.
The asymptotic law itself was finally proved in an elementary way in 1949 by
Selberg and Erdős (cf. [Sel51]).

Gauss also suggested a much better approximation to $\pi(x)$. Computing
his tables of primes he noticed that if one counts primes in sufficiently large
intervals around a large $x$ their density tends to be close to $1/\log x$. For this
reason he decided that a better approximation to $\pi(x)$ would be the integral
logarithm
\[ \mathrm{Li}(x) = \int_2^x \frac{dt}{\log t}. \]


This observation was refined by Riemann, cf. [Rie1858]. Investigating the
zeta-function he came to a heuristic conclusion that $\mathrm{Li}(x)$ should be a very good
approximation to the function counting all powers of primes $\le x$ with the
weight equal to the reciprocal of the exponent, that is


\[ \pi(x) + \frac{1}{2} \pi(x^{1/2}) + \frac{1}{3} \pi(x^{1/3}) + \cdots \approx \mathrm{Li}(x). \tag{1.1.15} \]


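If one wants to express $\pi(x)$ via $\mathrm{Li}(x)$ from here one should use the Möbius
function


\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by a square of a prime}, \\ (-1)^k & \text{otherwise}, \end{cases} \tag{1.1.16} \]


where $k$ is the number of primes dividing $n$. Let us consider the function


\[ F(x) = \sum_{n=1}^{\infty} \frac{1}{n}\, \pi(x^{1/n}). \tag{1.1.17} \]
Then
\[ \pi(x) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, F(x^{1/n}), \tag{1.1.18} \]
and


\[ \pi(x) \approx \sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, \mathrm{Li}(x^{1/n}). \tag{1.1.19} \]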


The special case (1.1.18) of a general inversion formula easily follows from the
main property of the Möbius function:


\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \tag{1.1.20} \]


In fact, if $n = \prod_{i=1}^{s} p_i^{\alpha_i}$, $\alpha_i > 0$, then for $s \ge 1$ we have
\[ \sum_{d \mid n} \mu(d) = \sum_{k=0}^{s} (-1)^k \binom{s}{k} = (1 - 1)^s = 0. \]
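Both (1.1.20) and the inversion (1.1.18) can be verified in exact arithmetic for small $x$. In the sketch below (all names are ours) the quantity $\pi(x^{1/k})$ is computed as the number of primes $p$ with $p^k \le x$, which avoids taking real roots, and the sums are truncated where all further terms vanish.

```python
from fractions import Fraction

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0             # divisible by the square of a prime
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def pi_via_inversion(x):
    """Recover pi(x) from F via pi(x) = sum_n mu(n)/n * F(x^(1/n))."""
    primes = [p for p in range(2, x + 1)
              if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    def count(k):                    # pi(x^(1/k)) = #{p prime : p^k <= x}
        return sum(1 for p in primes if p ** k <= x)
    K = x.bit_length()               # 2^K > x, so count(k) = 0 for k >= K
    def F_root(n):                   # F(x^(1/n)) = sum_m pi(x^(1/(mn))) / m
        return sum(Fraction(count(m * n), m) for m in range(1, K // n + 1))
    return sum(Fraction(mobius(n), n) * F_root(n) for n in range(1, K + 1))
```

The result is a `Fraction` whose value is the integer $\pi(x)$, since all the fractional contributions cancel by (1.1.20).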


The right hand side of (1.1.19) is denoted $R(x)$. Table 1.2 (cf. [Ries85], [RG70],
[Zag77]) shows how well it approximates $\pi(x)$.


Table 1.2.

         x        pi(x)        R(x)
 100000000      5761455     5761552
 200000000     11078937    11079090
 300000000     16252325    16252355
 400000000     21336326    21336185
 500000000     26355867    26355517
 600000000     31324703    31324622
 700000000     36252931    36252719
 800000000     41146179    41146248
 900000000     46009215    46009949
1000000000     50847534    50847455


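It is useful to slightly renormalize $\mathrm{Li}(x)$, taking instead the complex integral


\[ \mathrm{li}(e^{u}) = \int_{-\infty + iv}^{u + iv} \frac{e^{z}}{z}\, dz \qquad (v \ne 0). \tag{1.1.21} \]


For $x > 2$, $\mathrm{li}(x)$ differs from $\mathrm{Li}(x)$ by the constant $\mathrm{li}(2) \approx 1.045$. The Riemann
function
\[ R(x) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, \mathrm{li}(x^{1/n}) \]
is an entire function of $\log x$. It can be expanded into a rapidly convergent
power series
\[ R(x) = 1 + \sum_{m=1}^{\infty} \frac{t^m}{m!\, m\, \zeta(m + 1)}, \]
where $x = e^t$, and
\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p \text{ prime}} (1 - p^{-s})^{-1}. \tag{1.1.23} \]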
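The power series for $R(x)$ converges fast enough to be summed in ordinary floating-point arithmetic. In the sketch below the helper `zeta` is our own approximation of $\zeta(s)$ for real $s \ge 2$ (a partial sum plus the first terms of the Euler-Maclaurin tail), and the cutoffs `N = 50` and `terms = 200` are assumptions chosen for the range $x \le 10^{9}$, not values from the text.

```python
import math

def zeta(s, N=50):
    """zeta(s) for real s >= 2: partial sum plus an Euler-Maclaurin tail estimate."""
    head = sum(float(n) ** -s for n in range(1, N))
    tail = float(N) ** (1 - s) / (s - 1) + 0.5 * float(N) ** -s \
        + s * float(N) ** (-s - 1) / 12
    return head + tail

def riemann_R(x, terms=200):
    """R(x) = 1 + sum_{m>=1} t^m / (m! * m * zeta(m+1)), with t = log x."""
    t = math.log(x)
    total, power = 1.0, 1.0          # power holds t^m / m!
    for m in range(1, terms + 1):
        power *= t / m
        total += power / (m * zeta(m + 1))
    return total

print(round(riemann_R(10 ** 8)))
```

All terms of the series are positive, so no cancellation occurs and double precision suffices to obtain the values of Table 1.2 to the nearest integer.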


Of course, this Riemann zeta function is the main hero of the story. Its properties,
established or conjectural, govern the behaviour of $\pi(x)$. Riemann showed
how to extend $\zeta(s)$ meromorphically to the whole complex plane (notice that
(1.1.23) converges only for $\mathrm{Re}(s) > 1$) and he deduced the astonishing explicit
formula for $\pi(x)$. This looks as follows:


\[ F_0(x) = \mathrm{li}(x) - \sum_{\rho} \mathrm{li}(x^{\rho}) + \int_x^{\infty} \frac{du}{(u^2 - 1)\, u \log u} - \log 2, \tag{1.1.24} \]


where the sum is taken over all zeros $\rho$ of $\zeta(s)$, and


\[ F_0(x) = \lim_{\varepsilon \to 0} \frac{F(x + \varepsilon) + F(x - \varepsilon)}{2}. \]


The formula (1.1.24) was published by Riemann in 1859 and proved by Mangoldt
in 1895. The series in (1.1.24) is only conditionally convergent. If one
excludes the “trivial zeroes” $\rho = -2, -4, -6, \dots$, whose contribution is
insignificant, the remaining summation should be made in the order of increasing $|\rho|$.
The set of non-trivial zeros is symmetric with respect to complex conjugation
and lies in the critical strip $0 < \mathrm{Re}(s) < 1$. The first five roots with positive
imaginary part, up to eight decimal digits, are (cf. [Zag77], [Ries85], [RG70])


\[ \begin{aligned}
\rho_1 &= \tfrac{1}{2} + 14.134735\, i, \\
\rho_2 &= \tfrac{1}{2} + 21.022040\, i, \\
\rho_3 &= \tfrac{1}{2} + 25.010856\, i, \\
\rho_4 &= \tfrac{1}{2} + 30.424878\, i, \\
\rho_5 &= \tfrac{1}{2} + 32.935057\, i.
\end{aligned} \]


Let us consider the number $\theta = \sup \mathrm{Re}(\rho)$. From (1.1.24) it follows that
\[ \pi(x) - \mathrm{li}(x) = O(x^{\theta} \log x). \tag{1.1.25} \]


This estimate would be non-trivial if we knew that $\theta < 1$. Unfortunately, it is
only known that there are no roots on $\mathrm{Re}(s) = 1$ and in a small neighbourhood
of this line whose width tends to zero as $|s|$ grows (cf. [Pra57]). The famous
Riemann hypothesis, that all non-trivial roots lie on the line $\mathrm{Re}(s) = \frac{1}{2}$, is
still unproved. A corollary of this would be


\[ \pi(x) = \mathrm{li}(x) + O(x^{1/2} \log x). \]


These questions, however, lie far outside elementary number theory. 
We shall return to the Riemann—Mangoldt type explicit formulae below, 
cf. Part II, Chapter 6, §6.2. 


1.2 Diophantine Equations of Degree One and Two 


1.2.1 The Equation ax + by = c


In this section, all coefficients and indeterminates in various equations are
assumed to be integers unless otherwise stated. Consider first a linear equation
with two indeterminates. The set


\[ I(a, b) = \{c \mid ax + by = c \text{ is solvable}\} \]


coincides with the ideal generated by $a$ and $b$, that is, with $d\mathbb{Z}$ where
$d = \gcd(a, b)$. It follows that the equation


\[ ax + by = c \tag{1.2.1} \]


is solvable iff $d$ divides $c$. A special solution can be found with the help of
the Euclidean algorithm: first compute $X$, $Y$ with $aX + bY = d$ and then put
$x_0 = eX$, $y_0 = eY$ where $e = c/d$. One easily sees that the general solution is
given by the formula


\[ x = x_0 + (b/d)\, t, \qquad y = y_0 - (a/d)\, t, \]


where $t$ is an arbitrary integer.
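The recipe for (1.2.1) can be sketched as follows. The function names are ours, and the case $a = b = 0$ is excluded for simplicity. The function returns a special solution $(x_0, y_0)$ together with the step $(b/d, -a/d)$ of the general solution, or `None` when $d$ does not divide $c$.

```python
def extended_gcd(a, b):
    """Return (g, X, Y) with a*X + b*Y == g and |g| = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def solve_linear(a, b, c):
    """Solve a*x + b*y = c over Z; solutions are (x0 + dx*t, y0 + dy*t), t in Z."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    d, X, Y = extended_gcd(a, b)
    if d < 0:                         # normalize so that d = gcd(a, b) > 0
        d, X, Y = -d, -X, -Y
    if c % d != 0:
        return None                   # unsolvable: d does not divide c
    e = c // d
    return e * X, e * Y, b // d, -(a // d)
```

For example, `solve_linear(6, 15, 9)` describes all integral points on the line $6x + 15y = 9$, while `solve_linear(6, 15, 10)` reports that there are none.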
Equation (1.2.1) is the simplest example of the general Diophantine prob- 
lem of investigating systems of polynomial equations 


\[ F_1(x_1, \dots, x_n) = 0, \ \dots, \ F_m(x_1, \dots, x_n) = 0 \tag{1.2.2} \]


with integral coefficients. We see that all the main questions can be effectively 
answered for (1.2.1): the existence of solutions, computation of a single solution,
description of the set of all solutions, counting solutions in a box etc. We shall 
consider more complicated instances of (1.2.2) and attempt to extend these 
results. 


1.2.2 Linear Diophantine Systems 


The Euclidean algorithm allows us to investigate in the same way a general 
linear Diophantine system 


\[ Ax = b, \tag{1.2.3} \]
where


\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}. \]

This can be done with the help of the elementary divisor theorem. Recall
that an elementary operation on the rows of a matrix over $\mathbb{Z}$ adds to one row
an integral multiple of another. One defines an elementary column operation
similarly. An elementary operation is equivalent to multiplication of the initial
matrix on the left (resp. on the right) by a matrix of the form $E_{ij} = E + \lambda e_{ij}$
belonging to $\mathrm{SL}_m(\mathbb{Z})$ (resp. $\mathrm{SL}_n(\mathbb{Z})$). By repeated application of elementary
operations we replace $A$ by $UAV$ where $U$ and $V$ are unimodular matrices
with integral entries. On the other hand, the system


\[ UAVy = Ub \tag{1.2.4} \]


is equivalent to (1.2.3) since their solutions are in one-to-one correspondence: 
$x = Vy$. We can use this if we manage to replace $A$ by a simpler matrix $A' = UAV$.
In fact, using the Euclidean algorithm and a version of the Gaussian
elimination procedure avoiding divisions, one can find a matrix $A'$ of the form


\[ \begin{pmatrix} d_1 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & \cdots & d_r & \cdots & 0 \\ 0 & 0 & \cdots & 0 & \cdots & 0 \end{pmatrix} = UAV. \tag{1.2.5} \]


Hence we either see that our system has no solutions even in $\mathbb{Q}$, or we obtain
the set of all rational solutions from the very simple system $d_i y_i = c_i$, $c = Ub$,
for $i \le r$, $y_i = 0$ for the other $i$. The set of integral solutions is non-empty
iff $d_i$ divides $c_i$ for $i \le r$, and can then be parametrized in an obvious way.
The product $d_1 \cdots d_i$ coincides with the gcd of all minors of $A$ of order $i$, and
$d_i \mid d_{i+1}$. They are called the elementary divisors of $A$. It follows that (1.2.3)
is solvable iff the elementary divisors of $A$ of orders $\le m$ coincide with those
of the extended matrix (with the column $b$ added). In turn, this is equivalent
to the simultaneous solvability of the congruences


\[ Ax \equiv b \pmod N \]


where N is an arbitrary integer. Such a condition can be readily extended to 
a completely general system of Diophantine equations. Clearly, it is necessary 
for the existence of a solution. The above argument shows that for (1.2.3) it 
is also sufficient. When this is true for a class of equations one says that the 
Minkowski—Hasse principle is valid for this class. The question of the validity 
of the Minkowski—Hasse principle is a central problem in this theory. We shall 
discuss it below in §1.2.4 and in Part II, §4.5, §5.3.
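The solvability criterion in terms of elementary divisors can be tried out on small examples. The sketch below does not perform the reduction to the diagonal form (1.2.5); instead it uses the characterization of $d_1 \cdots d_i$ as the gcd of all minors of order $i$, with determinants computed by cofactor expansion. This is a swapped-in shortcut that is reasonable only for very small matrices, and all function names are ours.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Integer determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minor_gcd(A, i):
    """gcd of all minors of A of order i (0 if all of them vanish)."""
    g = 0
    for rows in combinations(range(len(A)), i):
        for cols in combinations(range(len(A[0])), i):
            g = gcd(g, det([[A[r][c] for c in cols] for r in rows]))
    return g

def elementary_divisors(A):
    """The non-zero elementary divisors d_1 | d_2 | ... | d_r of A."""
    divisors, prev = [], 1
    for i in range(1, min(len(A), len(A[0])) + 1):
        g = minor_gcd(A, i)
        if g == 0:
            break
        divisors.append(g // prev)
        prev = g
    return divisors

def solvable_over_Z(A, b):
    """Ax = b has an integral solution iff A and (A | b) have the same divisors."""
    extended = [row + [b_i] for row, b_i in zip(A, b)]
    return elementary_divisors(A) == elementary_divisors(extended)
```

For the single equation $2x + 4y = c$ this reduces to the condition $2 \mid c$ of §1.2.1.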

More difficult problems arise if one wants to find “the smallest solution” 
to (1.2.3) with respect to some norm. These questions are considered in the 
geometry of numbers. Siegel (cf. [Sie29], [Fel82]) has shown that the system
of linear equations


\[ a_{i1} x_1 + \cdots + a_{in} x_n = 0 \qquad (i = 1, \dots, m) \]
with $n > m$, in which the integers $a_{ij}$ are bounded by $B$, has a non-trivial
integral solution with coordinates bounded by $1 + (nB)^{m/(n-m)}$. If
the rows of $A = (a_{ij})$ are linearly independent and $d$ denotes the gcd of
the minors of order $m$ of $A$, one can obtain the more precise upper bound
$\bigl(d^{-1} \sqrt{\det(A\, {}^{t}\!A)}\bigr)^{1/(n-m)}$. This estimate and its generalization to algebraic
number fields was proved by Bombieri and Vaaler (cf. [BV83]) using fairly
subtle results from geometric number theory (Minkowski’s theory of the
successive minima of quadratic forms [Cas59a]).

For applications, it is essential to develop efficient methods for finding 
solutions of a linear Diophantine system with non-negative coordinates. This 
is the central problem of integral linear programming. It belongs to the class of 
intractable problems i.e. those for which polynomial algorithms are not known. 
The intractability of the knapsack problem has been used in cryptography (see
Ch. 2). It consists of finding a solution of the equation $a_1 x_1 + \cdots + a_n x_n = b$
with $x_i \in \{0, 1\}$ where $a_i$, $b$ are given integers (see [Kob94], [LeH.84]).


1.2.3 Equations of Degree Two 


Consider the following Diophantine equation with integral coefficients
\[ f(x_1, x_2, \dots, x_n) = \sum_{i \le j} a_{ij} x_i x_j + \sum_{i=1}^{n} b_i x_i + c = 0. \tag{1.2.6} \]


Here we shall begin by finding the set of all rational solutions, which is easier
than finding the integral solutions but far from trivial.

A classical example is furnished by the rational parametrization of the
circle $x^2 + y^2 = 1$:


\[ x = \frac{2t}{1 + t^2}, \qquad y = \frac{1 - t^2}{1 + t^2} \qquad \Bigl(x = \sin\varphi, \ y = \cos\varphi, \ t = \tan\bigl(\tfrac{\varphi}{2}\bigr)\Bigr). \tag{1.2.7} \]


This parametrization allows us in turn to describe all primitive Pythagorean
triples $(X, Y, Z)$, that is, natural solutions of $X^2 + Y^2 = Z^2$ with $\gcd(X, Y, Z) = 1$.
The answer is: $X = 2uv$, $Y = u^2 - v^2$, $Z = u^2 + v^2$, where $u > v > 0$ are
relatively prime integers of opposite parity. To prove this it suffices to put
$t = v/u$ in (1.2.7).
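The description of primitive Pythagorean triples is easily turned into an enumeration. In the sketch below (the function name is ours) we additionally require $u$ and $v$ to be of opposite parity; without this condition a pair such as $u = 3$, $v = 1$ would produce the non-primitive triple $(6, 8, 10)$.

```python
from math import gcd

def primitive_triples(limit):
    """Primitive triples (X, Y, Z) = (2uv, u^2 - v^2, u^2 + v^2) with Z <= limit."""
    triples = []
    u = 2
    while u * u + 1 <= limit:
        for v in range(1, u):
            if gcd(u, v) == 1 and (u - v) % 2 == 1 and u * u + v * v <= limit:
                triples.append((2 * u * v, u * u - v * v, u * u + v * v))
        u += 1
    return triples

print(primitive_triples(30))
```

Each triple corresponds to the rational point $(X/Z, Y/Z)$ on the circle obtained from (1.2.7) with $t = v/u$.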

Similarly, finding rational solutions to (1.2.6) is equivalent to finding integral
solutions to the homogeneous equation


\[ \begin{aligned} F(X_0, X_1, \dots, X_n) &= \sum_{i,j=0}^{n} f_{ij} X_i X_j \\ &= \sum_{i,j=1}^{n} f_{ij} X_i X_j + 2 \sum_{i=1}^{n} f_{i0} X_i X_0 + f_{00} X_0^2 \end{aligned} \tag{1.2.8} \]


where $f_{ij} = f_{ji} = a_{ij}/2$ for $1 \le i < j \le n$ and $f_{0i} = f_{i0} = b_i/2$ for
$i = 1, 2, \dots, n$, $f_{00} = c$. The non-homogeneous coordinates $x_1, \dots, x_n$ are related
to the homogeneous coordinates $X_0, \dots, X_n$ by $X_i = x_i X_0$ $(i = 1, 2, \dots, n)$.
The quadratic form $F(X)$ can be conveniently written as


\[ F(X) = X^t A_F X, \qquad X^t = (X_0, X_1, \dots, X_n), \]


where $A_F = (f_{ij})$ is the matrix of coefficients. If there exists a non-trivial
integral solution to $F(X) = 0$ we say that $F$ represents zero over $\mathbb{Z}$. This
equation defines a quadric $Q_F$. Its points are all complex solutions (except
the trivial one) considered as points in the complex projective space $\mathbb{CP}^n$:


\[ Q_F = \{(z_0 : z_1 : \cdots : z_n) \in \mathbb{CP}^n \mid F(z_0, z_1, \dots, z_n) = 0\}. \]


Any non-trivial rational solution of $F(X) = 0$ gives a point on this quadric.
If we know one solution $X^0$ then we can find all the others by considering
intersections of $Q_F$ with the (projective) lines defined over $\mathbb{Q}$ and containing
$X^0$. Algebraically, a line passing through $X^0$ and $Y^0$ consists of all points
$uX^0 + vY^0$. The equation $F(uX^0 + vY^0) = 0$ reduces to


\[ uv \sum_{i=0}^{n} \frac{\partial F}{\partial X_i}(X^0)\, Y_i^0 + v^2 F(Y^0) = 0. \]


In general, not all the partial derivatives $\frac{\partial F}{\partial X_i}$ vanish at $X^0$. If this is the case,
then for any $Y^0$ we can find an intersection point of $Q_F$ with our line:


\[ v = -u \sum_{i=0}^{n} \frac{\partial F}{\partial X_i}(X^0)\, Y_i^0 \Big/ F(Y^0). \tag{1.2.9} \]


(If by chance $F(Y^0) = 0$ then $Y^0$ is already on $Q_F$.) Again, this point will in
general be unique. Limiting cases can be well understood in geometric terms:
if all partial derivatives vanish at $X^0$ then our quadric is a cone with vertex
$X^0$, and the problem is reduced to that of finding rational points on the base
of the cone, this base being a quadric of lower dimension; if a line happens
to lie entirely on $Q_F$ then all its rational points should be taken into account,
etc.

This stereographic projection method, applied to $x^2 + y^2 = 1$ and the point
$(0, -1)$, gives precisely (1.2.7) if one denotes by $t$ a coefficient of the equation
of the line passing through $(0, -1)$ and $(x, y)$: $x = t(y + 1)$.
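The projection from $(0, -1)$ can be carried out in exact rational arithmetic. The sketch below writes the line through $(0, -1)$ in the form $x = t(y + 1)$; this normalization of the parameter is our assumption, chosen so that the second intersection point comes out as in (1.2.7). The function name `circle_point` is ours.

```python
from fractions import Fraction

def circle_point(t):
    """Second intersection of the line x = t*(y + 1) with the circle x^2 + y^2 = 1."""
    t = Fraction(t)
    return 2 * t / (1 + t * t), (1 - t * t) / (1 + t * t)

for t in (Fraction(1, 2), Fraction(2, 3), Fraction(3, 1)):
    x, y = circle_point(t)
    print(t, x, y, x * x + y * y)
```

Every rational $t$ gives a rational point of the circle, and clearing denominators for $t = v/u$ recovers the Pythagorean triples.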

Considering the equation


\[ F(X_0, X_1, \dots, X_n) = 0 \tag{1.2.10} \]


(with $F$ as in (1.2.8)) over the rationals, we could alternatively begin by
diagonalizing $F$ by a non-degenerate linear substitution $X = CY$ where
$C \in M_{n+1}(\mathbb{Q})$. The matrix $C$ can be found effectively by Lagrange’s method of
successively completing the squares. The previous geometric analysis then
becomes quite transparent.
Fig. 1.3.


For homogeneous equations such as (1.2.10) the problems of finding solu- 
tions in Q and in Z are essentially equivalent. Since we can find all solutions 
starting from one of them, the key question is that of deciding whether there 
is one. An answer is given by the following result. 


1.2.4 The Minkowski—Hasse Principle for Quadratic Forms 


Theorem 1.2. A quadratic form $F(x_1, x_2, \dots, x_n)$ of rank $n$ with integral
coefficients represents zero over the rationals iff for any $N$, the congruence
$F(x_1, \dots, x_n) \equiv 0 \pmod N$ has a primitive solution and in addition $F$
represents zero over the reals, i.e. it is indefinite.


For a general proof see [BS85], [Cas78]. Of course, the necessity of this 
condition is obvious. 

We reproduce here the beautiful proof of sufficiency in the case $n = 3$ due
to Legendre ([BS85], [Ire82]). Let


\[ F = a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 \qquad (a_1 a_2 a_3 \ne 0). \]


Since F’ is indefinite we may assume that the first two coefficients are 
positive while the third one is negative. Furthermore, we can and will assume 
that they are square-free and relatively prime: this may be achieved by obvious 
changes of variables and by dividing the form by the gcd of its coefficients. 
Denote the form with such properties by 


\[ ax^2 + by^2 - cz^2. \tag{1.2.11} \]
Consider a prime $p$ dividing $c$. Since $F \equiv 0 \pmod p$ has a primitive solution,
we can find a non-trivial solution $(x_0, y_0)$ to the congruence
$a x_0^2 + b y_0^2 \equiv 0 \pmod p$. Therefore
\[ ax^2 + by^2 \equiv a y_0^{-2} (x y_0 + y x_0)(x y_0 - y x_0) \pmod p. \]


For $p = 2$ we clearly have


\[ ax^2 + by^2 - cz^2 \equiv (ax + by - cz)^2 \pmod 2. \]


Hence for all $p \mid 2abc$ we can find linear forms $L^{(p)}$, $M^{(p)}$ of $x, y, z$ with integral
coefficients such that $F \equiv L^{(p)} M^{(p)} \pmod p$. Using the Chinese Remainder
Theorem, we find $L$ (resp. $M$) with integral coefficients congruent to those of
$L^{(p)}$ (resp. $M^{(p)}$) $\pmod p$ for all $p \mid abc$. We then have


\[ ax^2 + by^2 - cz^2 \equiv L(x, y, z)\, M(x, y, z) \pmod{abc}. \tag{1.2.12} \]
Consider now the integral points in the box
\[ 0 \le x \le \sqrt{bc}, \qquad 0 \le y \le \sqrt{ac}, \qquad 0 \le z \le \sqrt{ab}. \tag{1.2.13} \]


If we exclude the trivial case $a = b = c = 1$, not all square roots are integers,
so that the total number of points will exceed the volume of this box, which
is $abc$. Hence there are two different points where $L$ takes the same value
$\bmod\ abc$. Taking their difference, we find


\[ L(x_0, y_0, z_0) \equiv 0 \pmod{abc} \tag{1.2.14} \]
for some $|x_0| \le \sqrt{bc}$, $|y_0| \le \sqrt{ac}$, $|z_0| \le \sqrt{ab}$. Hence
\[ a x_0^2 + b y_0^2 - c z_0^2 \equiv 0 \pmod{abc} \tag{1.2.15} \]


and
\[ -abc < a x_0^2 + b y_0^2 - c z_0^2 < 2abc. \]


It follows that either
\[ a x_0^2 + b y_0^2 - c z_0^2 = 0 \tag{1.2.16} \]
or
\[ a x_0^2 + b y_0^2 - c z_0^2 = abc. \tag{1.2.17} \]


In the first case the theorem is proved. In the second case we obtain the
following non-trivial solution


\[ a(x_0 z_0 + b y_0)^2 + b(y_0 z_0 - a x_0)^2 - c(z_0^2 + ab)^2 = 0. \]


Legendre’s original statement is that $ax^2 + by^2 - cz^2 = 0$ is solvable iff all the
residue classes $bc \pmod a$, $ac \pmod b$, $-ab \pmod c$ are squares.
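Legendre's criterion is easy to test against a direct search. The sketch below assumes, as in the proof, that $a$, $b$, $c$ are positive, square-free and pairwise coprime (this is not checked by the code); squares modulo $m$ are found by brute force, and the function names are ours.

```python
from math import isqrt

def is_square_mod(r, m):
    """True if the residue class of r is a square in Z/mZ (brute force)."""
    r %= m
    return any(x * x % m == r for x in range(m))

def legendre_solvable(a, b, c):
    """Legendre's criterion for a x^2 + b y^2 - c z^2 = 0."""
    return (is_square_mod(b * c, a) and is_square_mod(a * c, b)
            and is_square_mod(-a * b, c))

def find_solution(a, b, c, bound):
    """Search for a non-trivial solution with 0 <= x, y <= bound, or return None."""
    for x in range(bound + 1):
        for y in range(bound + 1):
            n = a * x * x + b * y * y
            if n > 0 and n % c == 0:
                z = isqrt(n // c)
                if z * z == n // c:
                    return x, y, z
    return None
```

For example, $x^2 + y^2 = 2z^2$ passes the criterion and has the solution $(1, 1, 1)$, whereas $x^2 + y^2 = 3z^2$ fails it because $-1$ is not a square modulo 3.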

One can prove that an indefinite quadratic form of rank $\ge 5$ always represents
zero over the rationals. For smaller rank, the Minkowski-Hasse principle
can be combined with an a priori minimization of the moduli to be tested 
to give an effective way of establishing the existence of a solution. Below we 
shall reformulate this approach using the more convenient language of p-adic 
numbers (cf. Part II Chap. 4 §4.2.5, 4.3.1, Chap. 5 §5.3.6). 
1.2.5 Pell’s Equation


For non-homogeneous problems, the difference between rational and integral
solutions becomes essential. For example, consider Pell’s equation


\[ x^2 - dy^2 = 1, \tag{1.2.18} \]


where d is a positive integer (and not a square). Since we know one trivial 
rational solution (1,0) the others can be easily found by the method described 
above. However, to obtain only integral solutions we must act in a totally 
different way. 

First of all, assume that the set of non-trivial integral solutions is non-empty
(in fact, this can be proved by various methods). It is sufficient to
consider only solutions with positive coordinates. We shall call such a solution
$(x_1, y_1)$ minimal if the linear form $x + \sqrt{d}\, y$ takes its minimal value on it. This
solution is unique since $\sqrt{d}$ is irrational. The central result of the theory of
Pell’s equation states that all solutions are of the form $(\pm x_n, \pm y_n)$ where
$x_n + \sqrt{d}\, y_n = (x_1 + \sqrt{d}\, y_1)^n$, $n$ being an arbitrary non-negative integer.

The most natural proof, which admits a far-reaching generalization, is
based on studying the quadratic field $K = \mathbb{Q}(\sqrt{d}) = \{a + b\sqrt{d} \mid a, b \in \mathbb{Q}\}$. The
set $A = \mathbb{Z} + \mathbb{Z}\sqrt{d}$ is a subring in $K$. The norm of $\alpha = a + b\sqrt{d}$ is by definition


\[ N(\alpha) = N_{K/\mathbb{Q}}(\alpha) = a^2 - db^2. \]
Clearly,
\[ N(\alpha\beta) = N(\alpha) N(\beta) \tag{1.2.19} \]


for all $\alpha, \beta \in K$. Solutions of Pell’s equation are numbers in $A$ with norm 1.
From (1.2.19) it follows that they form a group (with multiplication as the
group law), in which the positive elements form the cyclic subgroup generated
by $x_1 + y_1\sqrt{d}$.
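The structure of the solution set can be illustrated numerically. The sketch below finds the minimal solution by a naive search over $y$ (the bound `y_max` and the function names are our assumptions; this is not one of the classical algorithms) and then generates further solutions by multiplying by $x_1 + y_1\sqrt{d}$, i.e. via $(x, y) \mapsto (x_1 x + d y_1 y, \ x_1 y + y_1 x)$.

```python
from math import isqrt

def pell_minimal(d, y_max=10 ** 6):
    """Minimal positive solution of x^2 - d y^2 = 1 by search over y (d not a square)."""
    for y in range(1, y_max + 1):
        x2 = 1 + d * y * y
        x = isqrt(x2)
        if x * x == x2:
            return x, y
    raise ValueError("no solution found with y <= y_max")

def pell_solutions(d, count):
    """The first `count` positive solutions, as powers of x1 + y1*sqrt(d)."""
    x1, y1 = pell_minimal(d)
    solutions, (x, y) = [], (1, 0)
    for _ in range(count):
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
        solutions.append((x, y))
    return solutions

print(pell_minimal(13), pell_solutions(2, 3))
```

The search is exponential in the size of the answer in general; the continued fraction method is the practical tool.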

In classical papers several methods were suggested for finding the minimal
solution, or at least some solution. One of these algorithms is based on
approximation theory (cf. §4 below). Dirichlet in 1837 published explicit formulae
giving some solutions of Pell’s equation expressed through trigonometric
functions. For example, for $d = 13$ his general formulae show that
$x_1 + y_1\sqrt{13} = \eta^6$, where
\[ \eta = \frac{\sin\frac{2\pi}{13}\, \sin\frac{5\pi}{13}\, \sin\frac{6\pi}{13}}{\sin\frac{\pi}{13}\, \sin\frac{3\pi}{13}\, \sin\frac{4\pi}{13}} \in \mathbb{Q}(\sqrt{13}) \]
(cf. [Dir68], [BS85], [Maz83]). In 1863 Kronecker published an expression for
$x_1 + y_1\sqrt{d}$ via special values of elliptic functions (cf. [Kr1863], [Sie65], [Wei76]).
Finally, it is worth mentioning that a general quadratic Diophantine equation
in two variables over the integers may be reduced by linear substitutions
to a Pell-like equation if one solution is known.
A solution of Pell’s equation using continued fractions is described in
§1.4.5.
1.2.6 Representation of Integers and Quadratic Forms by
Quadratic Forms


Consider two quadratic forms with integral coefficients


\[ f(x) = f(x_1, \dots, x_n) = \sum_{i,j=1}^{n} a_{ij} x_i x_j = A[x] = x^t A x, \]


\[ g(y) = g(y_1, \dots, y_m) = \sum_{i,j=1}^{m} b_{ij} y_i y_j = B[y] = y^t B y, \]


where $A$ and $B$ are symmetric matrices. We shall say that $f$ represents $g$ over
$\mathbb{Z}$ if for some $C \in M_{n,m}(\mathbb{Z})$ we have


\[ f(Cy) = g(y), \quad \text{or, equivalently,} \quad A[C] = B. \tag{1.2.20} \]


In particular, for $m = 1$ and $g(y) = by^2$, $f$ represents $g$ iff $f(c_1, \dots, c_n) = b$
for some integers $c_1, \dots, c_n$.

Pell’s equation considered above is a special case of the much more difficult 
general problem of representing integers and quadratic forms by quadratic 
forms. We shall sketch below some results and approaches to this vast domain. 

Lagrange proved that every positive integer is a sum of four squares. A 
more difficult result due to Gauss states that $b > 0$ is a sum of three integer
squares iff it is not of the form $4^k(8l - 1)$, $k, l \in \mathbb{Z}$. Lagrange’s theorem can
be easily deduced from this fact (cf. [Se70], [Cas78]).

Put


\[ r_k(n) = \mathrm{Card}\{(n_1, \dots, n_k) \in \mathbb{Z}^k \mid n_1^2 + \cdots + n_k^2 = n\}. \tag{1.2.21} \]


For example, $r_2(5) = 8$, as one may convince oneself by listing all solutions.
There exist many formulae for this arithmetical function (cf. a vast bibliography
in [Kog71]). Most of them are descendants of the classical formula of
Jacobi ([Mum83], [Se70], [And76]):


\[ r_4(n) = \begin{cases} 8 \sum\limits_{d \mid n} d, & \text{if } n \text{ is odd}, \\[6pt] 24 \sum\limits_{d \mid n,\ d \equiv 1 (2)} d, & \text{if } n \text{ is even}. \end{cases} \tag{1.2.22} \]
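Jacobi's formula can be compared with a direct count for small $n$. The sketch below (function names are ours) enumerates all integer vectors with coordinates bounded by $\sqrt{n}$, so it is meant only for small $k$ and $n$; the formula is used in the form $r_4(n) = 8 \sum_{d \mid n} d$ for odd $n$ and $24 \sum_{d \mid n,\, d \text{ odd}} d$ for even $n$.

```python
from itertools import product
from math import isqrt

def r(k, n):
    """r_k(n): number of (n_1, ..., n_k) in Z^k with n_1^2 + ... + n_k^2 = n."""
    bound = isqrt(n)
    return sum(1 for v in product(range(-bound, bound + 1), repeat=k)
               if sum(c * c for c in v) == n)

def jacobi_r4(n):
    """Jacobi's formula for r_4(n) in terms of the odd divisors of n."""
    odd_divisor_sum = sum(d for d in range(1, n + 1, 2) if n % d == 0)
    return (8 if n % 2 else 24) * odd_divisor_sum

print(r(2, 5), [(n, r(4, n), jacobi_r4(n)) for n in range(1, 9)])
```

In particular $r_4(n) > 0$ for every $n \ge 1$, which is Lagrange's theorem.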


The proof is based on a study of the generating function for the sequence
$r_k(n)$, that is, the series


\[ \sum_{n=0}^{\infty} r_k(n) q^n = \sum_{(n_1, \dots, n_k) \in \mathbb{Z}^k} q^{n_1^2 + \cdots + n_k^2} = \theta(\tau)^k, \]


where
\[ \theta(\tau) = \sum_{n \in \mathbb{Z}} q^{n^2}, \qquad q = e^{2\pi i \tau}. \tag{1.2.23} \]


This theta-function is a holomorphic function on the complex upper half-plane
$H = \{\tau \in \mathbb{C} \mid \mathrm{Im}(\tau) > 0\}$. It has many remarkable analytic properties.
They can be summarized by saying that $\theta^4(\tau)$ is a modular form of weight 2
with respect to the group $\Gamma_0(4)$ where


\[ \Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) \ \Big|\ N \mid c \right\}. \tag{1.2.24} \]


This means that the holomorphic differential $\theta^4(\tau)\, d\tau$ is invariant with respect
to the substitutions $\tau \mapsto (a\tau + b)(c\tau + d)^{-1}$ for every matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\Gamma_0(4)$.
Modular functions will be considered more systematically in Part II, Ch. 6,
§6.3. The space of all such differentials is two-dimensional, and one can
construct a basis of this space with the help of Eisenstein series whose Fourier
coefficients are more or less by construction certain divisor sums. Examining
the first two coefficients of the series one finds an expression for $\theta^4(\tau)$ as
a linear combination of the Eisenstein series. On comparing coefficients one
obtains (1.2.22). This method is very general. When the number of squares
grows one has to take into account not only the Eisenstein series but also cusp
forms whose Fourier coefficients have a more complicated arithmetical nature
but in many cases allow a non-trivial direct interpretation. If one manages to
construct an explicit basis of the relevant modular forms, one can then express
the theta-series of a quadratic form $f(x_1, \dots, x_k) = A[x]$ with respect to this
basis
\[ \theta(\tau; f) = \sum_{x \in \mathbb{Z}^k} e(f(x)\tau) = \sum_{n=0}^{\infty} r(f; n)\, q^n \tag{1.2.25} \]
where


\[ e(\tau) = \exp(2\pi i \tau) = q, \qquad r(f; n) = \mathrm{Card}\{x \in \mathbb{Z}^k \mid f(x) = n\}. \]


This theta-series is a modular form of the weight k/2 with respect to a con- 
gruence subgroup of the modular group. 

For recent progress by G. Shimura on the representation of integers as
sums of squares, we refer to [Shi02], [Shi04].

We quote as an example a formula proved by A.N. Andrianov ([An65],
[Fom77]). Let $f = x^2 + y^2 + 9(z^2 + t^2)$. The theta-series of this form is a
modular form of weight 2 w.r.t. $\Gamma_0(36)$. For any prime $p \ne 2, 3$ we have


\[ r(f; p) = \frac{4}{3}(p + 1) - \frac{8}{3} \sum_{x=0}^{p-1} \left(\frac{x^3 + 1}{p}\right), \tag{1.2.26} \]


where the sum in the right hand side contains the Legendre symbols, cf. §1.1.5.
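A formula of this kind is easy to test for small primes. The sketch below compares a brute-force count of the representations of $p$ by $f = x^2 + y^2 + 9(z^2 + t^2)$ with the expression $\frac{4}{3}(p + 1) - \frac{8}{3} \sum_{x=0}^{p-1} \left(\frac{x^3 + 1}{p}\right)$; this explicit form of (1.2.26) is our reading of the formula and should be treated as an assumption, as are the function names.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def r_f(n):
    """Number of (x, y, z, t) in Z^4 with x^2 + y^2 + 9(z^2 + t^2) = n."""
    b, b3 = isqrt(n), isqrt(n // 9)
    count = 0
    for z in range(-b3, b3 + 1):
        for t in range(-b3, b3 + 1):
            rest = n - 9 * (z * z + t * t)
            if rest < 0:
                continue
            for x in range(-b, b + 1):
                y2 = rest - x * x
                if y2 >= 0 and isqrt(y2) ** 2 == y2:
                    count += 1 if y2 == 0 else 2     # y and -y
    return count

def andrianov(p):
    """Right hand side of the assumed form of (1.2.26)."""
    s = sum(legendre(x ** 3 + 1, p) for x in range(p))
    return (4 * (p + 1) - 8 * s) // 3

for p in (5, 7, 11, 13, 17):
    print(p, r_f(p), andrianov(p))
```

For $p \equiv 2 \bmod 3$ the character sum vanishes, so the count reduces to the Eisenstein part $\frac{4}{3}(p + 1)$.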
Generating functions are traditionally used in combinatorics and the theory
of partitions. The simple partitions of $n$ into sums of non-increasing natural
summands are counted by the partition function $p(n)$:


\[ \begin{aligned}
p(1) &= 1: \quad 1 = 1; \\
p(2) &= 2: \quad 2 = 2, \ 1 + 1; \\
p(3) &= 3: \quad 3 = 3, \ 2 + 1, \ 1 + 1 + 1; \\
p(4) &= 5; \quad p(5) = 7.
\end{aligned} \]

Its generating function satisfies the Euler identity (cf. [Cha70], [And76]): for 
lq] < 1 one has 


(l—q™)?. (1.2.27) 


5 


1+ 5° p(n)q" = 


To prove this, it suffices to represent the r.h.s. as the product of the power 
series and to notice that p(n) is the number of solutions of a linear Diophantine 
equation with an infinite set of non-negative indeterminates 


m=1 


a, + 2a2 + 8a3 +--+ = 7. 
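The proof just sketched is also a practical way to compute $p(n)$: one multiplies the truncated series $1 + q^m + q^{2m} + \cdots$ together, one factor at a time. The following sketch does this with integer coefficient lists; the function name `partition_numbers` is an illustrative choice.

```python
def partition_numbers(limit):
    """Coefficients p(0), ..., p(limit) of prod_{m>=1} (1 - q^m)^(-1)."""
    p = [1] + [0] * limit
    for m in range(1, limit + 1):
        # Multiplying by 1/(1 - q^m) = 1 + q^m + q^(2m) + ... allows
        # any number a_m >= 0 of summands equal to m.
        for n in range(m, limit + 1):
            p[n] += p[n - m]
    return p

print(partition_numbers(5))  # [1, 1, 2, 3, 5, 7]
```

The inner loop runs upwards in $n$ on purpose, so that each factor can be used arbitrarily many times.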


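Remarkably, the theta-series of certain quadratic forms are also connected
with certain infinite products similar to (1.2.27). For example, if $|q| < 1$, $z \ne 0$
we have (cf. [And76])

$$\sum_{n=-\infty}^{\infty} z^n q^{n^2} = \prod_{m=0}^{\infty} (1 - q^{2m+2})(1 + zq^{2m+1})(1 + z^{-1}q^{2m+1}) \quad \text{(Jacobi)},$$

$$\sum_{n=0}^{\infty} q^{n(n+1)/2} = \prod_{m=1}^{\infty} \frac{1 - q^{2m}}{1 - q^{2m-1}} \quad \text{(Gauss)}.$$

These identities follow from a more general result of Cauchy, valid for $|q| < 1$,
$|t| < 1$, $a \in \mathbb{C}$:

$$1 + \sum_{n=1}^{\infty} \frac{(1 - a)(1 - aq) \cdots (1 - aq^{n-1})\,t^n}{(1 - q)(1 - q^2) \cdots (1 - q^n)} = \prod_{n=0}^{\infty} \frac{1 - atq^n}{1 - tq^n}. \tag{1.2.28}$$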


Recently the list of such identities has been greatly enlarged, thanks to the
discovery that they are connected with the representation theory of the simple
Lie algebras, root systems and finite simple groups (cf. [Mac80]).
An impressive example of the use of generating functions is given by
Borcherds [Borch92] in his proof of the Conway and Norton conjectures
concerning connections between the monster simple group M (and also other
finite sporadic simple groups) and modular functions. This group is the largest
sporadic finite simple group; its order is

8080, 17424, 79451, 28758, 86459, 90496, 17107, 57005, 75436, 80000, 00000 =
$2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71$.


The degree of the smallest nontrivial irreducible complex representation
of M is 196883, which is 1 less than the first nontrivial $q$-coefficient of the
famous $j(q)$ or elliptic modular function. In fact

$$j(q) = q^{-1} + 196884q + 21493760q^2 + \cdots$$

and the other coefficients of $j$ turn out to be simple combinations of the
degrees (traces on the identity) of representations of M.
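The first coefficients quoted above can be recomputed from classical formulas. The sketch below assumes the standard expression $j = E_4^3/\Delta$ with $E_4 = 1 + 240\sum_{n\ge1}\sigma_3(n)q^n$ and $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$, neither of which is stated in this section. In this normalization $j$ has constant term 744, which the expansion in the text omits. The name `j_coefficients` is illustrative.

```python
def j_coefficients(terms):
    """First `terms` coefficients of q*j(q), i.e. of q^-1, q^0, q^1, ... in j(q)."""
    size = terms

    def mul(a, b):
        # Product of two power series truncated to `size` coefficients.
        out = [0] * size
        for i, ai in enumerate(a):
            for k, bk in enumerate(b):
                if i + k >= size:
                    break
                out[i + k] += ai * bk
        return out

    # sigma3[n] = sum of d^3 over the divisors d of n
    sigma3 = [0] * size
    for d in range(1, size):
        for n in range(d, size, d):
            sigma3[n] += d**3
    e4 = [1] + [240 * sigma3[n] for n in range(1, size)]
    e4_cubed = mul(mul(e4, e4), e4)

    # prod = prod_{n>=1} (1 - q^n)^24, so that Delta = q * prod
    prod = [1] + [0] * (size - 1)
    for n in range(1, size):
        for _ in range(24):
            for i in range(size - 1, n - 1, -1):  # multiply by (1 - q^n) in place
                prod[i] -= prod[i - n]

    # Power-series division e4_cubed / prod (prod has constant term 1).
    quot = []
    for i in range(size):
        quot.append(e4_cubed[i] - sum(quot[k] * prod[i - k] for k in range(i)))
    return quot

print(j_coefficients(4))  # [1, 744, 196884, 21493760]
```

The third entry is $196884 = 196883 + 1$, the degree of the smallest nontrivial representation of M plus that of the trivial one.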

Conway and Norton conjectured in [CoNo79] that the functions $j_g(q)$
obtained by replacing the traces on the identity by the traces on other elements $g$
of M are "genus zero" modular functions. In other words, if $G_g$ is the subgroup
of $SL_2(\mathbb{R})$ which fixes $j_g(q)$, then the quotient of the upper half of the complex
plane by $G_g$ is a sphere with a finite number of points removed corresponding
to the cusps of $G_g$, cf. §6.3.

The proof is just as remarkable as the original moonshine conjectures and 
involves the theory of vertex operator algebras and generalized Kac-Moody 
algebras, cf. [Kon96]. 


It turns out that some questions of quantum field theory are related
to modularity properties of such $q$-expansions, cf. [DGM90]. For example,
this property is an open question for the $q$-expansions

$$\sum_{X \in \mathbb{Z}^n,\ x_i \ge 0} \frac{q^{{}^t\!XAX + BX + c}}{(q)_{x_1} \cdots (q)_{x_n}},$$

where $X = (x_1, \dots, x_n) \in \mathbb{Z}^n$, $n \ge 1$, $(q)_m = (1 - q)(1 - q^2) \cdots (1 - q^m)$,
$A \in M_n(\mathbb{Q})$, $B \in \mathbb{Q}^n$, $c \in \mathbb{Q}$ (private communication by H.Cohen).


Symmetry properties of generating functions were used in Wiles' proof
of Fermat's Last Theorem and of the Shimura-Taniyama-Weil Conjecture
(see [Wi], [Ta-Wi], [DDT97] and Chapter 7). In this truly marvelous proof,
a traditional argument of reductio ad absurdum is presented in the following
form: if $a^p + b^p = c^p$, $abc \ne 0$, for a prime $p \ge 5$ (a primitive solution $(a, b, c)$),
then one can associate to $(a, b, c)$ a certain generating function $f = f_{a,b,c} :
\mathbb{H} \to \mathbb{C}$ on the Poincaré upper half-plane $\mathbb{H}$, defined by a Fourier series with
the first coefficient equal to 1, as explained in §7.1. It turns out that this
function has too many symmetries, expressing the fact that it is a modular
cusp form of weight 2 and level 2, and forcing $f = 0$ by §6.3, a contradiction
with the construction of $f$.


1.2.7 Analytic Methods 


Generating functions are also used to obtain various asymptotic formulae for
functions like $r(f;n)$ and $p(n)$ as $n \to \infty$. In particular, many results have
been derived using the Hardy-Littlewood circle method, its variants and
generalizations (cf. [Vin52], [Vin71], [VK], [Mal62], [HW81], [Vau81-97], [Des90]).
The application of this method to a generating function

$$F(\tau) = \sum_{n=0}^{\infty} a(n)\,q^n \qquad (q = e(\tau) = \exp(2\pi i \tau))$$

starts with Cauchy's formula:

$$a(n) = \frac{1}{2\pi i} \oint_{|q| = r < 1} F(\tau)\,q^{-(n+1)}\,dq. \tag{1.2.29}$$


The following discussion can be efficiently applied to many situations when
the unit circle is the natural boundary for the function $F(\tau)$ and roots of
unity on this boundary behave as "the worst essential singularities" (to get
some feeling for this, look at the r.h.s. of (1.2.27)). The idea is to break the
integration domain into two parts: $I_1$ (the contribution of roots of unity of
comparatively small degree) and $I_2$ (everything else) and to attempt to prove
that $I_2$ is much smaller than $I_1$. To understand the asymptotic behaviour of $I_1$
and to majorize $I_2$ one often uses exact or approximate functional equations
for $F(\tau)$, Poisson summation etc.

For example, to estimate $p(n)$, Hardy, Littlewood and Ramanujan put
$r = e^{-2\pi/n^2}$. In terms of $\tau$, they integrated over the segment $L_n = \{\tau =
x + iy \mid 0 \le x \le 1,\ y = 1/n^2\}$, which they divided up as follows: $I_1$ is the union
of the pairwise disjoint segments $I_{p,q} = \{x \mid |x - p/q| \le 1/(2qn^{\delta})\}$ $(\delta > 1)$, where
$p/q$ runs through the rational numbers between 0 and 1 with denominator $\le n$;
$I_2$ is the complement of $I_1$.

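For (1.2.27) this furnishes the Hardy-Ramanujan asymptotic formula

$$p(n) = \frac{e^{K\lambda_n}}{4\sqrt{3}\,\lambda_n^2} + O\!\left(e^{K\lambda_n}/\lambda_n^3\right),$$

where

$$\lambda_n = \sqrt{n - 1/24}, \qquad K = \pi\sqrt{2/3}$$

(cf. [Cha70]).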
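The quality of this approximation is easy to observe numerically. The sketch below compares exact values of $p(n)$, computed from the Euler product (1.2.27), with the main term $e^{K\lambda_n}/(4\sqrt{3}\lambda_n^2)$. The function names are illustrative, and floating-point arithmetic is used for the main term, so only a few digits of the ratio are meaningful.

```python
from math import exp, pi, sqrt

def partition_number(n):
    """Exact p(n), from the Euler product truncated at q^n."""
    p = [1] + [0] * n
    for m in range(1, n + 1):
        for k in range(m, n + 1):
            p[k] += p[k - m]
    return p[n]

def hardy_ramanujan_main_term(n):
    """Main term exp(K*lambda_n) / (4*sqrt(3)*lambda_n^2)."""
    lam = sqrt(n - 1 / 24)
    K = pi * sqrt(2 / 3)
    return exp(K * lam) / (4 * sqrt(3) * lam**2)

for n in (50, 100, 200, 400):
    ratio = partition_number(n) / hardy_ramanujan_main_term(n)
    print(n, round(ratio, 4))
```

The printed ratios stay slightly below 1 and approach 1 slowly, as the relative error term $O(1/\lambda_n)$ suggests.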


Later this method was perfected by K.Rademacher who gave an exact 
formula for p(n) as an infinite sum whose summands correspond to (all) roots 
of unity. 
In one of the applications of the circle method to the theory of quadratic
forms, A.V.Malyshev proved in [Mal62] the following general result. Let $k \ge 4$,
$f(x_1, \dots, x_k)$ a positive quadratic form with integral coefficients and
determinant $d$. Then as $n \to \infty$ we have

$$r(f;n) = \frac{\pi^{k/2}}{\Gamma(k/2)\,d^{1/2}}\,n^{k/2-1} H(f;n) + O\!\left(d^{(k+12)/8}\,n^{(k-1)/4+\varepsilon}\right).$$

Here the constant in $O$ depends only on $k$ and $\varepsilon > 0$, and $H(f;n)$ is the
so-called singular series. This series is obtained in the process of computing
the contribution of $I_1$ as an infinite product over all primes including the
"infinite prime":

$$H(f;n) = r_\infty(f;n) \prod_p r_p(f;n),$$

where

$$r_p(f;n) = \lim_{m \to \infty} p^{-m(k-1)}\,\mathrm{Card}\{x \in (\mathbb{Z}/p^m\mathbb{Z})^k \mid f(x) \equiv n \bmod p^m\}$$

and $r_\infty(f;n)$ is a certain "real density" of the solutions of $f(x) = n$.

It follows that if n is sufficiently large and is representable by f modulo 
all prime powers, then it is representable by f. This method however does 
not work for 2 or 3 variables, where more subtle approaches are needed (cf. 
[Lin79], [GF77], [Fom77], [Lom78]). 

The circle method was considerably modified and perfected by I.M.Vi- 
nogradov (cf. [Vin52], [Vin71], [VK]), who suggested replacing generating 
functions by exponential sums, which are essentially their partial sums re- 
stricted to the unit circle, e.g. 


$$\Theta_N(\tau; f) = \sum_{x \in \mathbb{Z}^k,\ |x_i| < N} e(f(x)\tau). \tag{1.2.30}$$


As a function of the real variable $\tau$ this sum oscillates vigorously and has
local maxima (of its modulus, real, and imaginary parts) at rational numbers 
with small denominators. This behaviour reflects the singular behaviour of the 
generating function in the vicinity of its natural boundary but is much less 
wild and more easily controllable. This is one of the reasons for the success of 
Vinogradov’s method. 

Figures 1.4 and 1.5 show the (scaled) graphs of the two simplest exponen- 
tial sums featuring this behaviour. Instead of Cauchy’s formula (1.2.29), one 
uses in Vinogradov’s method the integral formula 


$$\int_0^1 \Theta_N(\tau; f)\,e(-n\tau)\,d\tau = \mathrm{Card}\{x \in \mathbb{Z}^k \mid f(x) = n,\ |x_i| < N\} \tag{1.2.31}$$


which follows directly from the orthogonality of the basic exponential func- 
tions. 
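The orthogonality argument behind (1.2.31) can be illustrated on a small example. In the sketch below the integral over $[0, 1]$ is replaced by an average over the points $\tau = j/M$. This is an assumption of the illustration, not part of the method described above, and it is exact as soon as $M$ exceeds every value $|f(x) - n|$ that occurs. The names `theta_N` and `count_via_orthogonality` are illustrative.

```python
import cmath
from itertools import product

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)

def theta_N(tau, f, k, N):
    """Exponential sum over x in Z^k with |x_i| < N, as in (1.2.30)."""
    return sum(e(f(x) * tau) for x in product(range(-N + 1, N), repeat=k))

def count_via_orthogonality(n, f, k, N, M):
    """Average of theta_N(tau) * e(-n*tau) over tau = j/M, j = 0..M-1.

    The average of e((f(x) - n) * j / M) over j is 1 if M divides
    f(x) - n and 0 otherwise, so for M > max |f(x) - n| only the
    solutions of f(x) = n survive.
    """
    total = sum(theta_N(j / M, f, k, N) * e(-n * j / M) for j in range(M))
    return round((total / M).real)

def sum_of_two_squares(x):
    return x[0] ** 2 + x[1] ** 2

# |x_i| < 4, so f(x) <= 18 and M = 32 is large enough for 0 <= n <= 18.
print(count_via_orthogonality(5, sum_of_two_squares, 2, 4, 32))  # 8
```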
Fig. 1.4. y(r) = ae cos(2m <r)


Vinogradov's version of the circle method enabled him to prove that every
sufficiently large odd integer is a sum of three primes (Goldbach conjectured
in 1742 that every even integer is a sum of two primes) and to considerably diminish
the number of summands in Waring's problem (1770) on the representation
of large integers as sums of k-th powers. An improvement on Vinogradov's
bound for this number $G(k)$, due to Karacuba [Kar85], is $k(2\log k + 2\log\log k + 12)$.
Interesting results for $G(k)$ asymptotic to $k \log k$ have been obtained by
R.C.Vaughan and T.D.Wooley (cf. [VaWo91], [VaWoIV]). Further details of
analytic methods are outside the scope of this report and we refer the interested
readers to the monographs [Vin71], [VK], [Pos71], [AKCh87], [HW81], [Cha70],
[Vau81-97] and others. We should mention only the wide applicability of formulae
of the type (1.2.31) counting various numbers of solutions and the important role of
exponential sums like (1.2.30) in arithmetical problems (Gauss sums, Jacobi
sums, Kloosterman sums etc., cf. Ch.2, §2.2).

More generally, harmonic analysis is now used in number theory in its
non-commutative and multi-dimensional versions. For example, the construction
of the Hecke basis in the space of modular forms which is orthonormal with 
respect to the Petersson inner product (scalar product) can be considered as a 
two-dimensional analogue of the orthogonality relations for the exponentials 
mentioned above (Part II, Ch. 6, §6.3). 


1.2.8 Equivalence of Binary Quadratic Forms 


Fig. 1.5. y(7) = ee cos(2r #7)

Two quadratic forms over the integers f, g are called equivalent (over Z) if
they represent each other (cf. §1.2.6). We shall denote a binary quadratic form
$f(x, y) = Ax^2 + Bxy + Cy^2$ also $(A, B, C)$. Such a form is called primitive if A,
B and C have no common factor. Its discriminant is denoted $\Delta = B^2 - 4AC$.
Two forms f and g are called properly equivalent if we have

$$f(x, y) = g(mx + ny,\ kx + ly)$$

for an appropriate matrix

$$\begin{pmatrix} m & n \\ k & l \end{pmatrix} \in SL_2(\mathbb{Z}).$$


Gauss founded the equivalence theory of binary quadratic forms. He proved
that if $\Delta$ is not a square, then the set $Cl(\Delta)$ of proper equivalence classes of
forms with discriminant $\Delta$ can be made into a finite Abelian group with
respect to a natural composition law. (Actually, this was one of the first abstract
Abelian groups discovered in number theory.) Very recently M.Bhargava (a
PhD student of A.Wiles, cf. [Bha04]) found higher composition laws, giving a
new view on Gauss composition.

In order to define this composition law in modern terms, consider the
quadratic number field $K = \mathbb{Q}(\sqrt{\Delta}) = \mathbb{Q}(\sqrt{d}) = \{x + y\sqrt{d} \mid x, y \in \mathbb{Q}\}$ where
$d$ is a square-free integer. We have $\Delta = Dc^2$ where $D$ is the discriminant of
the quadratic field $K$: $D = d$ if $d \equiv 1 \bmod 4$ and $D = 4d$ otherwise. An
element $\alpha = x + y\sqrt{d} \in K$ is called an integer if its trace $2x$ and its norm
$x^2 - dy^2$ are integers. The set of all integers in $K$ forms a ring

$$\mathcal{O} = \langle 1, \omega \rangle = \{m + n\omega \mid m, n \in \mathbb{Z}\},$$

where $\omega = \sqrt{d}$ if $d \equiv 2, 3 \bmod 4$ and $\omega = (1 + \sqrt{d})/2$ if $d \equiv 1 \bmod 4$.
For any integer $c$ we can define a subring $\mathcal{O}_c = \mathbb{Z} + c\mathcal{O} = \langle 1, c\omega \rangle$. A
fractional ideal $M$ in $\mathcal{O}_c$ is a free additive subgroup with two generators which
is stable with respect to multiplication by elements of $\mathcal{O}_c$. The product of two
fractional ideals is, by definition, the subgroup generated by the products
of elements from one ideal with elements of the other. The fractional ideals
form an Abelian group with identity $\mathcal{O}_c$. To each such ideal $M$ corresponds
a quadratic form with discriminant $\Delta = Dc^2$ which can be constructed as
follows. Define the norm of $M$ by $N(M) = \mathrm{Card}(\mathcal{O}_c/M)$. Choose a basis
$\{\alpha, \beta\}$ for $M$ in such a way that $\gamma = -\beta/\alpha = x + y\sqrt{d}$ satisfies the condition
$y > 0$. Then the quadratic form in question is

$$f(x, y) = Ax^2 + Bxy + Cy^2 = \frac{N(\alpha x + \beta y)}{N(M)}.$$

One can check that this is a primitive integral form.

Two fractional ideals $M$ and $M_1$ are called equivalent in the narrow sense
if $M = \gamma M_1$ for some $\gamma \in K$ with positive norm. The equivalence classes
of fractional ideals correspond bijectively to the proper equivalence classes of
primitive binary forms of discriminant $Dc^2$. Multiplication of the fractional
ideals induces a group structure on this set. The identity of this group is
represented by the quadratic form $(1, 0, -\Delta/4)$ (resp. $(1, 1, (1 - \Delta)/4)$) if $\Delta$ is
even (resp. odd). In computations it is convenient to work with the reduced
forms $(A, B, C)$ for which $A > 0$, $-A < B \le A$, $\gcd(A, B, C) = 1$. If $\Delta < 0$
then the group $Cl(\Delta)$ is trivial exactly for the following values: $-\Delta =
4, 8, 3, 7, 11, 19, 43, 67, 163$ $(c = 1)$; $16, 12, 28$ $(c = 2)$; $27$ $(c = 3)$ (cf. Part II,
Ch. 5, §5.4.1).
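For negative discriminants the order of $Cl(\Delta)$ can be found by listing reduced forms. The sketch below assumes the usual complete reduction conditions for $\Delta < 0$, namely $-A < B \le A \le C$ with $B \ge 0$ when $A = C$. These are slightly more explicit than the conditions written above, and they select exactly one form in each proper equivalence class. The function names are illustrative.

```python
from math import gcd

def reduced_forms(disc):
    """Primitive reduced forms (A, B, C) with B^2 - 4AC = disc < 0."""
    assert disc < 0 and disc % 4 in (0, 1)
    forms = []
    A = 1
    while 3 * A * A <= -disc:            # |B| <= A <= C forces 3A^2 <= |disc|
        for B in range(-A + 1, A + 1):   # -A < B <= A
            if (B * B - disc) % (4 * A) != 0:
                continue
            C = (B * B - disc) // (4 * A)
            if C < A or (C == A and B < 0):
                continue
            if gcd(gcd(A, B), C) == 1:
                forms.append((A, B, C))
        A += 1
    return forms

def class_number(disc):
    return len(reduced_forms(disc))

trivial = [-d for d in range(1, 201)
           if (-d) % 4 in (0, 1) and class_number(-d) == 1]
print(sorted(-d for d in trivial))
# [3, 4, 7, 8, 11, 12, 16, 19, 27, 28, 43, 67, 163]
```

For comparison, `reduced_forms(-23)` returns the three forms $(1, 1, 6)$, $(2, -1, 3)$, $(2, 1, 3)$, so $Cl(-23)$ has order 3.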


1.3 Cubic Diophantine Equations 


1.3.1 The Problem of the Existence of a Solution 


For cubic forms $F(X, Y, Z)$ in three variables with integral coefficients,
nobody has succeeded in devising a general algorithm which provably decides
whether the equation $F = 0$ has a non-trivial integral solution. Large classes
of such equations have been studied both theoretically and numerically; see for
example the early influential papers by E.S.Selmer (cf. [Selm51] and [Selm54])
devoted to the equations

$$aX^3 + bY^3 + cZ^3 = 0.$$

Even some of the simplest equations like $3X^3 + 4Y^3 + 5Z^3 = 0$ fail to satisfy
the Minkowski-Hasse principle: they have no non-trivial integral solutions
although they do have both real solutions and primitive integral solutions
modulo any $N > 1$. The degree of such failure can be measured quantitatively
by the Shafarevich-Tate group: cf. §5.3.
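The local half of this statement can be observed by brute force for small moduli. In the sketch below a solution modulo $N$ is called primitive when $\gcd(X, Y, Z, N) = 1$; this reading of "primitive" is an assumption of the illustration. A finite search cannot prove that no integral solution exists, but it shows none in a small box. The function names are illustrative.

```python
from itertools import product
from math import gcd

def F(x, y, z):
    return 3 * x**3 + 4 * y**3 + 5 * z**3

def primitive_solution_mod(N):
    """Some (x, y, z) with F = 0 mod N and gcd(x, y, z, N) = 1, or None."""
    for x, y, z in product(range(N), repeat=3):
        if gcd(gcd(x, y), gcd(z, N)) == 1 and F(x, y, z) % N == 0:
            return (x, y, z)
    return None

def small_integral_solutions(bound):
    """Non-trivial integral zeros of F with |x|, |y|, |z| <= bound."""
    box = range(-bound, bound + 1)
    return [(x, y, z) for x, y, z in product(box, repeat=3)
            if (x, y, z) != (0, 0, 0) and F(x, y, z) == 0]

for N in range(2, 13):
    print(N, primitive_solution_mod(N))
print(small_integral_solutions(15))  # []
```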

D.R.Heath-Brown has shown (cf. [HB84]) that any non-singular cubic 
form in ten variables represents zero non-trivially, and C.Hooley in [H88] 
has established the Minkowski-Hasse principle for non-singular nonary cubic 
forms (a form is called non-singular if it and all its first partial derivatives 
have no common non-trivial complex zeroes). Previously Davenport and Birch 
had shown that there exist non-singular cubic forms in nine variables which 
do not represent zero modulo a power of every prime. 

Birch in [Bir61] established that forms of any odd degree d represent zero 
if the number of variables is sufficiently large (with the bound depending 
only on d). These results have since been generalized, extended and made 
more precise by several authors. They are proved by the circle method, cf. 
[Vau81-97], [Des90]. 


1.3.2 Addition of Points on a Cubic Curve 


Any ternary cubic form $F(X, Y, Z)$ defines a cubic curve $C$ in the complex
projective plane $\mathbb{P}^2$:

$$C = \{(X : Y : Z) \mid F(X, Y, Z) = 0\}. \tag{1.3.1}$$

If $C$ (that is, $F$) is non-singular, and if $F = 0$ has at least one rational
solution, then one can find a non-degenerate change of projective coordinates
with rational coefficients which reduces $F$ to a Weierstrass normal form

$$Y^2Z - X^3 - aXZ^2 - bZ^3 \qquad (a, b \in \mathbb{Q}). \tag{1.3.2}$$

One may also assume that the initial solution becomes the obvious solution
$(0 : 1 : 0)$ of (1.3.2). The non-singularity condition for (1.3.2) is equivalent to
the non-vanishing of the discriminant $4a^3 + 27b^2$. Non-singular cubic curves
are also called elliptic. Passing to non-homogeneous coordinates $x = X/Z$, $y =
Y/Z$ we reduce $F = 0$ to the form

$$y^2 = x^3 + ax + b, \tag{1.3.3}$$


where the cubic polynomial in the r.h.s. has no multiple roots. In this affine 
form, our initial solution becomes the infinite point O. There is a beautiful 
geometric description of a composition law on the set of rational points of C 
making it an Abelian group with O as identity (or zero). This is called the 
secant—tangent method (cf. [Sha88], [Cas66], [Kob84]). Namely, for a given pair 
of points P,Q € C(Q), we first draw a line containing them both. This line 
also intersects C at a well-defined third rational point P’. Now we again draw 
a line through P’ and O. Its third intersection point with C is, by definition, 
the sum P+ Q. If P= Q, the first line to be drawn should of course touch C 
at P. 


Fig. 1.6. 
Fig. 1.7. 


Calculating in non-homogeneous coordinates $P = (x_1, y_1)$, $Q = (x_2, y_2)$
one finds $P + Q = (x_3, y_3)$ where

$$x_3 = -x_1 - x_2 + \left(\frac{y_1 - y_2}{x_1 - x_2}\right)^2, \qquad
y_3 = \frac{y_1 - y_2}{x_1 - x_2}\,(x_1 - x_3) - y_1. \tag{1.3.4}$$

In the limit case $P = Q$ we have instead

$$x_3 = -2x_1 + \left(\frac{3x_1^2 + a}{2y_1}\right)^2, \qquad
y_3 = \frac{3x_1^2 + a}{2y_1}\,(x_1 - x_3) - y_1. \tag{1.3.5}$$

If $x_1 = x_2$ and $y_1 = -y_2$ then $P + Q = O$, the infinite point which is zero for
the group law.
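Formulas (1.3.4) and (1.3.5) translate directly into exact rational arithmetic. The following sketch uses Python's `fractions` module; representing the infinite point $O$ by `None` and the function names `add` and `on_curve` are choices made for this illustration only.

```python
from fractions import Fraction

O = None  # the infinite point, zero of the group law

def on_curve(P, a, b):
    """True if P lies on y^2 = x^3 + a*x + b (O counts as a point)."""
    if P is O:
        return True
    x, y = P
    return y * y == x**3 + a * x + b

def add(P, Q, a):
    """Sum of two points of y^2 = x^3 + a*x + b by the secant-tangent rule."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return O
    if x1 == x2:                       # P = Q: tangent line, formula (1.3.5)
        slope = Fraction(3 * x1 * x1 + a, 2 * y1)
    else:                              # secant line, formula (1.3.4)
        slope = Fraction(y1 - y2, x1 - x2)
    x3 = slope**2 - x1 - x2
    y3 = slope * (x1 - x3) - y1
    return (x3, y3)

# Example on y^2 = x^3 + 17 (a = 0, b = 17).
P, Q = (-2, 3), (-1, 4)
print(add(P, Q, 0))   # the point (4, -9)
print(add(P, P, 0))   # the point (8, -23)
```

Both results again satisfy $y^2 = x^3 + 17$, since $81 = 64 + 17$ and $529 = 512 + 17$.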

This method allows us to construct new rational points starting with some 
known ones. They will be the elements of the group generated by the initial 
points, e.g. mP, m € Z, if just one point P (except O) was found initially. 

For singular cubic curves this construction fails. For example, consider the
curve

$$C:\ y^2 = x^2 + x^3, \tag{1.3.6}$$

which is drawn in Fig. 1.8. Any line passing through $(0, 0)$ has only one more
common point with $C$: on $y = tx$ it is defined by the equation $x^2(t^2 - x - 1) = 0$.
Besides the trivial solution $x = 0$, we obtain $x = t^2 - 1$ and $y = t(t^2 - 1)$ so that
we have found all points on $C$ with the help of a rational parameterization.
In the non-singular case no such parameterization exists. On the other hand,
in our example we could have still defined the group law on the set of
non-singular points as above. However, this becomes simply multiplication (for a
suitably chosen rational parameterization).


Fig. 1.8. 


A curve admitting a rational parameterization is called rational. How one 
can establish that such a parameterization exists or otherwise, and how its 
existence influences the problem of describing all rational points, is answered 
by algebraic-geometric methods. 


1.3.3 The Structure of the Group of Rational Points of a 
Non-Singular Cubic Curve 


The most remarkable qualitative feature of the secant-tangent method is that
it allows one to construct all rational solutions of a non-singular cubic
equation (1.3.3) starting with only a finite number of them. In group-theoretical
language, the following result is true.


Theorem 1.3 (Mordell’s Theorem). The Abelian group C(Q) is finitely 
generated. 


(cf. [Mor22], [Cas66], [Mor69], [La83], [Se97] and Appendix by Yu.Manin to
[Mum74]). From the structure theorem for finitely generated Abelian groups,
it follows that

$$C(\mathbb{Q}) \cong A \times \mathbb{Z}^r$$

where $A$ is a finite subgroup consisting of all torsion points, and $\mathbb{Z}^r$ is a
product of $r$ copies of an infinite cyclic group. The number $r$ is called the rank
of $C$ over $\mathbb{Q}$.

The group $A$ can be found effectively. For example, Nagell and Lutz (cf.
[Lu37]) proved that torsion points on a curve $y^2 = x^3 + ax + b$ for which $a$
and $b$ are integers, have integral coordinates. Furthermore, the y-coordinate
of a torsion point either vanishes or divides $D = -4a^3 - 27b^2$.
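The Nagell-Lutz conditions leave only finitely many candidates for torsion points, and these can be listed by a direct search. The sketch below does this for integers $a$, $b$ with $D \ne 0$. The conditions are necessary but not sufficient, so the output is a list of candidates rather than the torsion subgroup itself. The function name `torsion_candidates` is illustrative.

```python
def torsion_candidates(a, b):
    """Integral points (x, y) on y^2 = x^3 + a*x + b with y = 0 or y | D."""
    D = -4 * a**3 - 27 * b**2
    assert D != 0, "the curve must be non-singular"
    ys = {0}
    for d in range(1, abs(D) + 1):
        if D % d == 0:
            ys.update((d, -d))
    points = []
    for y in sorted(ys):
        c = b - y * y
        # Every root of x^3 + a*x + c has absolute value at most 1 + max(|a|, |c|).
        bound = 1 + max(abs(a), abs(c))
        for x in range(-bound, bound + 1):
            if x**3 + a * x + c == 0:
                points.append((x, y))
    return points

print(torsion_candidates(0, 1))
# [(2, -3), (0, -1), (-1, 0), (0, 1), (2, 3)]
```

For $y^2 = x^3 + 1$ all five candidates are in fact torsion points; together with $O$ they form a cyclic group of order 6, one of the groups allowed by Mazur's theorem.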

B.Mazur proved in 1976 that the torsion subgroup $A$ over $\mathbb{Q}$ can only be
isomorphic to one of the following fifteen groups:

$$\mathbb{Z}/m\mathbb{Z}\ \ (m \le 10,\ m = 12), \qquad \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2n\mathbb{Z}\ \ (n \le 4), \tag{1.3.7}$$

and all these groups occur, cf. [Maz77].

It is still an open question whether $r$ can be arbitrarily large. Mestre (cf.
[Me82]) constructed examples of curves whose ranks are at least 14. *)

A comparatively simple example of a curve of rank $\ge 9$ is also given there:
$y^2 + 9767y = x^3 + 3576x^2 + 425x - 2412$. One can conjecture that rank is
unbounded. B. Mazur (cf. [Maz86]) connects this conjecture with Silverman's
conjecture (cf. [Silv86]) that for any natural $k$ there exists a cube-free integer
which can be expressed as a sum of two cubes in more than $k$ ways.


Examples. 1) Let $C$ be given by the equation

$$y^2 + y = x^3 - x$$

whose integer solutions list all cases when a product of two consecutive integers
equals a product of three consecutive integers. Here $A$ is trivial while the free
part of $C(\mathbb{Q})$ is cyclic, with a generator $P = (0, 0)$. Points $mP$ (labeled by $m$)
are shown in Figure 1.9.
The following Table 1.3, reproduced here from [Maz86] with Mazur's kind
permission, shows the absolute values of the X-coordinates of points $mP$, for
even $m$ between 8 and 58.


* Martin and McMillen (2000) found an elliptic curve of rank $\ge 24$:

$$y^2 + xy + y = x^3 - 120039822036992245303534619191166796374x
+ 504224992484910670010801799168082726759443756222911415116$$

(see http://www.math.hr/~duje/tors/rankhist.html for more examples;
footnote by Yu.Tschinkel).
Fig. 1.9.


Table 1.3. 
20 
116 
3741 
8385 
239785 
59997896 
18490337896 
270896443865 
16683000076735 
2786836257692691 
314892968 1285740316 
342115756927607927420 
28025 112992256329 1422645 
8042875 18035141565236193151 
743043 134297049053529252783151 
3239336802390544740129153150480400 
261339025 24580 143443694240 1 26 13679600 
1251873709467 1 23982668303 1943583152550351 
596929565407758846078 157850477988229836340351 
238585858632982963 1608077553938 139264431352010155 
561860540 1843475352702275238228029 1882048809582857380 
23897505 191109 140186309909376606354352699564527 70356625916 
65008789078766455 2756007507 1 13064937939959207504295469 12218291 
86338 150358868067 1392136 12634565727407840380659 | 76743 159137754 17535 
132767834389488863 12588030404 44 14443 134057555343662544 164328809240 19065 
593076041546964 265894895676 | 7397943244827 29234687 1 145 12318727773285587667 1389 


One sees that the last figures lie approximately on a parabola. This is 
not an accident, but a reflection of the quadratic nature of heights on elliptic 
curves (cf. below). 

2) Table 1.4 was kindly calculated for this edition by H.Cohen, using the PARI
computing system, [BBBCO]. This table lists ranks $r$ and generators for curves
$X^3 + Y^3 = AZ^3$ with natural cube-free $A < 500$; it corrects and completes
the Tables of Selmer (cf. [Selm51], [Selm54]) which were reproduced in the
first edition [Ma-Pa]. Note the 3 missing values $A = 346, 382, 445$ for which
H.Cohen proved that $r = 1$, but the method of Heegner points for computing
generators (see §6.4.4) takes too much time. However, this computation was
completed by Ch. Delaunay, see Table 1.5.


Table 1.4. Number of generators $r$ and basic solutions of $X^3 + Y^3 = AZ^3$ with A
cube-free, A < 500.


A |r|(X,Y, Z) A |r|(X,Y,Z) 
6|1| (37, 17, 21) 94]1|(15642626656646177, 
7\1\(2, -1, 1) -15616184186396177, 
9}1](2, 1, 1) 590736058375050) 
12]1](89, 19, 39) 97|1|(14, -5, 3) 
13]1|(7, 2, 3) 98]1|(5, -3, 1) 
15]1|(683, 397, 294) 103]1]|(592, -349, 117) 
17 8, -1, 7) 105/1|(4033, 3527, 1014) 
19]2| (36, -17, 13),(109, -90, 31) 106]1] (165889, -140131, 25767) 
20 9, 1, 7) 107/1|(90, 17, 19) 
22|1] (25469, 17299, 9954) 110/2](181, -71, 37),(629, 251, 
26}1](3, -1, 1) 134) 
28]1](3, 1, 1) 114]1|(9109, -901, 1878) 
30]2] (163, 107, 57),(289, -19, 115/1| (5266097, -2741617, 
93) 1029364) 
31 37, -65, 42) 117/1](5, -2, 1) 
33 853, 523, 582) 1231] (184223499139, 
34]1](631, -359, 182) 10183412861, 
35]1](3, 2, 1) 37045412880) 
37|2](4, -3, 1),(10, -1, 3) 124]2](5, -1, 1),(479, -443, 57) 
42|1](449, -71, 129) 126/2]|(5, 1, 1),(71, -23, 14) 
43]1](7, 1, 2) 127/2](7, -6, 1),(121, -120, 7) 
49}1](11, -2, 3 130]1] (52954777, 33728183, 
50]1| (23417, -11267, 6111) 11285694) 
51]1](730511, 62641, 197028) 132]2] (2089, -901, 399),(39007, 
53]1] (1872, -1819, 217) -29503, 6342) 
58]1| (28747, -14653, 7083) 133/1](5, 2, 1) 
61|1](5, -4, 1) 134]1](9, 7, 2) 
62/1] (11, 7, 3) 139]1] (16, -7, 3) 
63]1](4, -1, 1) 140]1]| (27397, 6623, 5301) 
65/2](4, 1, 1),(191, -146, 39) 1411] (53579249, -52310249, 
67|1] (5353, 1208, 1323) 4230030) 
68] 1] (2538163, -472663, 142]1]| (2454839, 1858411, 530595) 
620505) 143]1|(73, 15, 14) 
69 5409, -10441, 3318) 151]1]|(338, -95, 63) 
70]1](53, 17, 13) 153]2|(70, -19, 13),(107, -56, 
71 97, -126, 43) 19 
75 7351, -11951, 3606) 156|1| (2627, -1223, 471) 
78]1|(5563, 53, 1302) 157|1| (19964887, -19767319, 
79 3, -4, 3) 1142148) 
84]1](433, 323, 111) 159]1| (103750849, 2269079, 
85] 1] (2570129, -2404889, 19151118) 
330498) 161|1]|(39, -16, 7) 
86|2] (13, 5, 3),(10067, -10049, 163]2|(11, -3, 2),(17, -8, 3) 
399) 164]1]| (311155001, -236283589, 
87 176498611, -907929611, 46913867) 
216266610) 166] 1] (1374582733040071, 
89]1] (53, 36, 13) -1295038816428439, 
90 241, -431, 273) 136834628063958) 
91]2](4, 3, 1),(6, -5, 1) 169/1](8, -7, 1) 
92] 1] (25903, -3547, 5733) 170|1| (26353, 14957, 5031) 
Table 1.4. (continued)


A |r|(X,Y,2Z) A |r|(X,Y, Z) 

171/1| (37, 20, 7) 231]1| (818567, -369503, 
1721] (139, -103, 21) 129186) 

177|1| (2419913540753, 2331] (124253, -124020, 3589) 

1587207867247, 236|1|(248957, 209827, 47106) 

468227201520) 238|1](53927, 3907, 8703) 
178]1| (110623913, 8065063, 241/1| (292, -283, 21) 

19668222) 244|1/(99, -67, 14) 
179|1| (2184480, -1305053, 246|2)(571049, -511271, 59787), 

357833) (2043883, -1767133, 
180]1](901, 719, 183) 230685) 

182]2](11, 5, 2),(17, 1, 3) 247) 1| (20, -11, 3) 
183]2|(14, 13, 3),(295579, 249] 1] (275657307291045075203- 

-190171, 46956) 684958997, 
186]1| (56182393, 15590357, -275522784- 

9911895) 968298556737485593813, 
187]1| (336491, -149491, 57070) 4974480998065387679- 
193]1| (135477799, -116157598, 603368524) 

16825599) 251|1] (4284, -4033, 373) 
195]1| (68561, -54521, 9366) 254|2] (238013, -206263, 26465), 
197|1| (2339, -2142, 247) (238393, -222137, 21676) 
198]1] (1801, -19, 309) 258|1| (2195839, -2047231, 
201]2]|(16, 11, 3),(3251, 124, 555) 198156) 
202|1|(2884067, 257437, 491652) 259|1)(13, -5, 2) 
203]2]|(229, 32, 39),(2426, 265] 1] (36326686731109813, 

-2165, 273) 9746422253537867, 
205]1|(8191, -6551, 1094) 5691757727610864) 
206]1|(5211, -4961, 455) 267|1]|(861409, -342361, 130914) 
209]2|(52, -41, 7),(125, -26, 21) 269| 1] (800059950, -786434293, 
210|2](1387, 503, 237),(3961, 45728263) 

-2071, 633) 271|2](10, -9, 1),(487, -216, 73) 
211]1|(74167, 66458, 14925) 273|2/(19, 8, 3),(190, -163, 21) 
212]1|(337705939853, 274|1](111035496427236122887, 

-315091652237, -43257922194314055637, 

32429956428) 16751541717010945845) 
213]1](64313150142602539- 275|1| (424560439, -309086839, 

525717, 55494828) 

46732739212871- 277|1/(209, -145, 28) 

851099283, 278|1)(13, 3, 2) 

12000095230- 279]1)(7, -4, 1) 

802028099750) 282|2)(117217, -96913, 13542), 
214|1|(307277703127, (2814607, 1571057, 452772) 

-244344663377, 283] 1] (20824888493, -8780429621, 

40697090945) 3090590958) 
215|1](6, -1, 1) 284| 1] (7722630462000896449- 
217|2](6, 1, 1),(9, -8, 1) 941136589, 
218|2](7, -5, 1),(279469, -12938136226219393- 

-61469, 46270) 03367981, 
219]2]|(17, 10, 3),(168704, 1174877194362780234- 

-36053, 27897) 594343698) 
222|1|(5884597, 858653, 972855) 285|1](18989, 1531, 2886) 
223]1](509, 67, 84) 286]1)(323, -37, 49) 

228] 1| (46323521, -27319949, 287|1/(248, 121, 39) 

7024059) 289]1/(199, 90, 31) 

229]1|(745, -673, 78) 294|1] (124559, -103391, 14118) 
295|1/|(34901, -16021, 5068) 
Table 1.4. (continued)
A |r|(X,Y, Z) A |r|(X,Y,Z) 
301|1](382, 5, 57) 358] 1/ (7951661, 2922589, 
303]1](2659949, 67051, 396030) 1138095) 
305]1](86, -81, 7) 359] 1)(77517180, 50972869, 
306]1] (6697, -3943, 921) 11855651) 
308]1}(199, 109, 31) 363] 1) (1909159356457, 
309]2](20, 7, 3),(272540932, -1746345039913, 
-142217089, 38305371) 165073101648) 
310]1](5011613, -190493, 366] 1) (2087027, -1675277, 
740484) 228885) 
313] 1| (22, -13, 3) 367|1| (42349, 526, 5915) 
314|1| (241, -223, 21) 370|2| (7, 3, 1),(70523, 19387, 
316]1|(7, -3, 1) 9891) 
319]1](6462443919765751305- 372|1|(2717893, 630107, 379470) 
499, 373)1] (1604, -1595, 57) 
-6182025219694143- 377|1| (469, -237, 62) 
438499, 379|2| (15, -7, 2),(917, -908, 39) 
472407353310304561- 380|1|(1009, -629, 127) 
590) 382| 1 
321|1| (13755277819, 385|1| (20521, -17441, 2054) 
8670272669, 386|1](9, -7, 1) 
2164318002) 387|1| (8, -5, 1) 
322|1| (1873, 703, 278) 388] 1| (4659, -3287, 553) 
323|1| (252, 71, 37) 390|2| (3043, 467, 417),(4373, 
325]1|(128, 97, 21) -863, 597) 
330]1](1621, 1349, 273) 391] 1) (590456252061289, 
331]1](11, -10, 1) -171359229789289, 
333]1] (397, -286, 49) 80084103077160) 
335|2](7, -2, 1),(390997, 260243, 393] 1) (40454518555 13988711- 
61362) 059, 
337|1] (53750671, -53706454, 2369372172284459- 
1043511) 347309, 
339]1] (1392097139, -345604139, 587046969413536968336) 
198626610) 394] 1) (1439245403, -573627403, 
341|1|(6, 5, 1) 192088390) 
342]2|(7, -1, 1),(1253, -1205, 86) 395|1| (7891, -7851, 266) 
345|2](16543, 8297, 2454), 396] 1) (46789273, -37009657, 
(389699, -190979, 5074314) 
53292) 397|2|(12, -11, 1),(360, 37, 49) 
346|1 399|2| (22, 5, 3),(401, 328, 63) 
348]2] (40283, -15227, 5622), 402] 1) (585699417548405371, 
(2706139, 425861, 102798361240815491, 
385230) 79502362839530631) 
349] 1| (23, -14, 3) 403] 1| (53, -22, 7) 
355|1](2903959, 2617001, 407|2)(7, 4, 1),(33733, -33634, 
492516) 939) 
356|1](15026630492061476- 409] 1] (22015523, 21425758, 3687411) 
041947013, 411]1) (186871897, 49864103, 
-4709632110011335- 25292280) 
573393177, 413|1| (2575, -2103, 266) 
2098221141580681- 414] 1) (68073157, 32528843, 
446554589) 9454410) 
357|1|(19207, 6497, 2742) 418] 1|(76267, 25307, 10323) 
Table 1.4. (continued)


A |r|(X,Y,Z) A |r|(X,Y,Z) 

420|2|(2213, 1567, 327),(10459, -204264638826527324- 
-6679, 1263) 892641927694862943879, 

421|1| (19690, 4699, 2639) 97368775947767167139- 

422|1](15, 1, 2) 892682703702288385) 

425|1|(2393, 1007, 326) 457|1|(41, 31, 6) 

427|1|(25, -16, 3) 458|1|(953039, -761375, 97482) 

428]1| (1294057, -1190053, 460|1| (248768189, -234795689, 
104013) 17466345) 

429]1|(16739, 14149, 2598) 462|2|(3779, 379, 489),(11969, 

430]1| (5989967, 3449393, -7811, 1389) 

841204) 463]1| (403, -394, 21) 
431|1|(701, -270, 91) 465|1| (1212356942047, 
433|2|(37, 35, 6),(252, 181, 37) -1197072217207, 
435|2|(32779, -1459, 4326), 52307828958) 

(3784049, 2981071, 466| 1| (464540708319337302841, 

570276) 88798763256715446551, 
436|2|(19, 17, 3), 60057801943830995598) 

(1667465, 307927, 220362) 467|1| (1170, -703, 139) 

438] 1] (12636764083, 468|2|(7, 5, 1),(859, -763, 74) 
11127850973, 469|2| (13, -12, 1),(26, -17, 3) 
1979215602) 474|1|(568871, -453689, 57627) 

439|1|(571, -563, 26) 477|2|(89, 70, 13), 

441|1](13, 11, 2) (12040, -11881, 523) 

444|1| (4174254535499, 481|1|(43, 29, 6) 
-726500109131, 483|1|(2401741, 
546201297768) 1945259, 

445]1 352830) 

446|2|(23, -5, 3), (4286417, 484|1| (236521, -176021, 25235) 
-4285265, 52212) 485|1|(8, -3, 1) 

447|1| (4405301, -382301, 490|1| (193229, -74159, 24039) 
576030) 493} 1] (8432715268961, 

449]|1|(323, 126, 43) -1057596310369, 

450|1|(21079, 11321, 2886) 1066758076384) 

452/1|(851498679025552429, 494|1](59, -33, 7) 
224535817897760071, 495|1| (342361, -57241, 43212) 
111626729681785675) 497|2|(55, 16, 7), 

453|2|(23, 4, 3),(50167097, (7411, -6772, 579) 
39331207, 7447188) 498|2| (611137, -490123, 60543), 

454|1| (753389202595029867- (15811001, -15250751, 933765) 
852290245746241110629, 499|1|(80968219, 17501213, 10242414) 


Table 1.5. Basic solutions of $X^3 + Y^3 = AZ^3$ with A = 346, 382, 445.


Alr| (X,Y, Z) 

346|1| (47189035813499932580169103856786964321592777067, 
42979005685698193708286233727941595382526544683, 
8108695117451325702581978056293186703694064735) 

382|1| (584775341 19926126376218390196344577607972745895728749, 
16753262295 12584546381 1427438340702778576158801481539, 
8122054393485793893167719500929060093151854013194574) 

445|1| (362650186970550612016862044970863425187, 


-58928948142525345898087903372951745227, 
4743280029253607 266633386 1784516450106) 
These computations have now been extended up to A < 70000 (Stephens).
Don Zagier noticed that in this range there are about 38.3% of curves with
r = 0; 48.9% with r = 1; 11.7% with even r ≥ 2 and 1.1% with odd r ≥ 3, and
these values vary only slightly within large intervals of the tables. We refer to
[Si01] for a survey of open questions in arithmetic algebraic geometry.

3) Let $C$ be given by the equation


$$y^2 + y = x^3 - 7x + 6.$$


Then $C(\mathbb{Q}) \cong \mathbb{Z}^3$, and the points $(1, 0)$, $(2, 0)$, $(0, 2)$ form a basis of this group.
4) For $y(y+1) = x(x-1)(x+2)$ we have $r = 2$; for $y(y+1) = x(x-1)(x+4)$,
$r = 2$ (compare this with example 1).
5) Consider the curve $y^2 = x^3 + px$, $p = 877$. A generator modulo torsion
of the group of rational points of this curve has $x$-coordinate


$$x = \frac{375494528127162193105504069942092792346201}{6215987776871505425463220780697238044100}.$$


This shows that naive methods of seeking points quickly become inefficient
(cf. [Cas66], [CW77], [Coa84] for an educated approach).


1.3.4 Cubic Congruences Modulo a Prime 


Let $p$ be a prime and $F(X_0, X_1, X_2)$ a cubic form with integral coefficients.
Reducing $F$ modulo $p$, we obtain a cubic form over the prime finite field $\mathbb{F}_p$.
This reduction is called non-singular if it has no common zeroes with its
first partial derivatives in any extension of $\mathbb{F}_p$. We can also apply elementary
algebraic-geometric ideas to a field $K$ of finite characteristic. The normal
forms are then slightly more complicated. By making a change of projective
coordinates and passing to the non-homogeneous equation, we can always
reduce the equation $F = 0$ to the form


$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6,$$


where $a_1, a_2, a_3, a_4, a_6 \in K$ and


$$\Delta = -b_2^2 b_8 - 8 b_4^3 - 27 b_6^2 + 9 b_2 b_4 b_6 \ne 0,$$


where
$$b_2 = a_1^2 + 4a_2, \quad b_4 = 2a_4 + a_1 a_3, \quad b_6 = a_3^2 + 4a_6,$$
and $b_8 = a_1^2 a_6 + 4a_2 a_6 - a_1 a_3 a_4 + a_2 a_3^2 - a_4^2$, so that $4b_8 = b_2 b_6 - b_4^2$.


The notation $j = c_4^3/\Delta$ is used, where


$$c_4 = b_2^2 - 24 b_4, \quad c_6 = -b_2^3 + 36 b_2 b_4 - 216 b_6.$$


Then this equation can be further simplified using the transformation
$x \mapsto u^2 x' + r$, $y \mapsto u^3 y' + s u^2 x' + t$ in order to obtain the following (cf. [Ta73],
[Kob87]):
1) For $p \ne 2, 3$:
$$y^2 = x^3 + a_4 x + a_6 \quad \text{with } \Delta = -16(4a_4^3 + 27a_6^2) \ne 0. \qquad (1.3.8)$$


2) For $p = 2$ we have that the condition $j = 0$ is equivalent to $a_1 = 0$, and
the equation transforms as follows: if $a_1 \ne 0$ (i.e. $j \ne 0$), then choosing
suitably $r, s, t$ we can achieve $a_1 = 1$, $a_3 = 0$, $a_4 = 0$, and the equation
takes the form


$$y^2 + xy = x^3 + a_2 x^2 + a_6, \qquad (1.3.9)$$


with the condition of smoothness given by $\Delta \ne 0$. Suppose next that
$a_1 = 0$ (i.e. $j = 0$), then the equation transforms to


$$y^2 + a_3 y = x^3 + a_4 x + a_6, \qquad (1.3.10)$$


and the condition of smoothness in this case is $a_3 \ne 0$.
3) For $p = 3$:


$$y^2 = x^3 + a_2 x^2 + a_4 x + a_6, \qquad (1.3.11)$$


(here multiple roots are again disallowed).


The projective curve defined by the respective homogeneous equation always
has a rational point $O = (0 : 1 : 0)$.

How many points over $\mathbb{F}_p$, that is, solutions of the congruence $F \equiv 0 \bmod p$,
should we expect? Clearly, the total number (counting $O$) cannot exceed
$2p + 1$, since every finite $x$ gives no more than two values of $y$. On the other
hand, of all the non-zero residue classes, only half of them are squares (for
odd $p$). Hence we might expect that $x^3 + ax + b$ is a square only for about a
half of the $x$'s.


More precisely, let $\chi(x) = \left(\frac{x}{p}\right)$ be the Legendre symbol (cf. §1.1.5). Then,
by definition, the number of solutions of $y^2 = u$ in $\mathbb{F}_p$ is $1 + \chi(u)$. Therefore,


$$\mathrm{Card}\, C(\mathbb{F}_p) = 1 + \sum_{x \in \mathbb{F}_p} \left(1 + \chi(x^3 + ax + b)\right) = p + 1 + \sum_{x \in \mathbb{F}_p} \chi(x^3 + ax + b).$$


N.Koblitz in [Kob94] compares the last sum with the result of a random walk
on a line. After $p$ steps one might expect to be at distance roughly $\sqrt{p}$ from
zero. Actually, one can prove the following remarkable theorem (cf. [Ha37]):


Theorem 1.4 (Hasse's Theorem). Let $N_p = \mathrm{Card}\, C(\mathbb{F}_p)$. Then


$$|N_p - (p + 1)| \le 2\sqrt{p}.$$
An elementary proof of this theorem was given in 1956 (cf. [Man56]).
Since then, both the algebraic-geometric and the elementary proofs have
been greatly extended. For a review of the elementary methods, cf. [Step74],
[Step84], [Step94].
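
The character-sum formula for Card C(F_p) can be evaluated directly for small primes. The following is only an illustrative sketch: the function names and the sample curve y^2 = x^3 - x + 1 are assumptions made here, not taken from the text, and the Legendre symbol is computed by Euler's criterion.

```python
def legendre(u: int, p: int) -> int:
    """Legendre symbol chi(u) = (u/p) for an odd prime p (Euler's criterion)."""
    u %= p
    if u == 0:
        return 0
    return 1 if pow(u, (p - 1) // 2, p) == 1 else -1


def count_points(a: int, b: int, p: int) -> int:
    """Card C(F_p) for y^2 = x^3 + a*x + b, counting the point O at infinity."""
    return p + 1 + sum(legendre(x**3 + a * x + b, p) for x in range(p))


def hasse_holds(a: int, b: int, p: int) -> bool:
    """Check |N_p - (p + 1)| <= 2*sqrt(p) in exact integer arithmetic."""
    n_p = count_points(a, b, p)
    return (n_p - p - 1) ** 2 <= 4 * p


if __name__ == "__main__":
    # Hypothetical sample curve; its discriminant 4a^3 + 27b^2 = 23 is
    # non-zero modulo each of the primes used below.
    for p in (5, 7, 11, 13, 101, 1009):
        print(p, count_points(-1, 1, p), hasse_holds(-1, 1, p))
```

The cost is proportional to p, so this is useful only for checking small cases; it is not a practical point-counting method.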

We refer to [LaTr76] for the problem of the distribution of Frobenius
automorphisms for varying $p$, and of the difference $N_p - (p + 1)$, which is related
to the Sato-Tate Conjecture (cf. Chapter I in [Se68a] and §6.5.1).

The Abelian group structure on the group of points $E(\mathbb{F}_p)$ on an elliptic
curve is used in many arithmetical questions. In particular, the case when this
group is cyclic of large size leads to ECDLP ("Elliptic curve discrete logarithm
problem") which is very important for applications in public-key cryptography,
see [Kob87].


1.4 The Structure of the Continuum. Approximations 
and Continued Fractions 


1.4.1 Best Approximations to Irrational Numbers 
Since $\sqrt{2}$ is irrational, the quadratic form $x^2 - 2y^2$ cannot vanish at integral
points $(x, y) \ne (0, 0)$. The smallest values taken by this form at such points
are


$$x^2 - 2y^2 = \pm 1. \qquad (1.4.1)$$


This is an instance of Pell's equation, which we discussed in §1.2.5; we are now
interested in it because its successive solutions give the best approximations
to $\sqrt{2}$ by rational numbers.

More precisely, $a/b$ is said to be a best approximation to $\alpha$ if


$$|b\alpha - a| < |d\alpha - c|$$
for all $0 < d \le b$, $a/b \ne c/d$. Every solution to (1.4.1) can be obtained by setting


$$a + \sqrt{2}\, b = (1 + \sqrt{2})^n.$$


Table 1.6.
   x |    y | x/y
   1 |    1 | 1.0
   3 |    2 | 1.5
   7 |    5 | 1.4
 ... |  ... | ...
 239 |  169 | 1.414201...
 577 |  408 | 1.414215...
1393 |  985 | 1.4142132...
3363 | 2378 | 1.4142136...
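
The rows of Table 1.6 can be regenerated by expanding the powers of 1 + sqrt(2) in integers. This is a minimal sketch; the function name `pell_pm1` is an assumption made for illustration.

```python
def pell_pm1(count: int):
    """Return the pairs (x, y) with x + y*sqrt(2) = (1 + sqrt(2))**n, n = 1..count."""
    x, y = 1, 1
    pairs = []
    for _ in range(count):
        pairs.append((x, y))
        # (x + y*sqrt2) * (1 + sqrt2) = (x + 2y) + (x + y)*sqrt2
        x, y = x + 2 * y, x + y
    return pairs


if __name__ == "__main__":
    for x, y in pell_pm1(10):
        # x^2 - 2y^2 alternates between -1 and +1
        print(x, y, x * x - 2 * y * y, x / y)
```

Each step multiplies by 1 + sqrt(2), whose norm is -1, which is why the sign of x^2 - 2y^2 alternates.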


1.4.2 Farey Series 


One way of finding good approximations is connected with a specific procedure
for enumerating all rational numbers between 0 and 1. Denote by $F_n$ the Farey
series of order $n$, which consists of all such numbers in increasing order whose
denominators are $\le n$:


$$F_n = \{a/b \mid 0 \le a \le b \le n,\ (a, b) = 1\}. \qquad (1.4.2)$$


Table 1.7. (Farey series; the entries are not legible in the source.)


Theorem 1.5 ([HaWr]). For every real number $\alpha \in [0, 1]$ there exists $a/b \in F_n$
such that


$$\left|\alpha - \frac{a}{b}\right| \le \frac{1}{b(n+1)}. \qquad (1.4.3)$$


The proof is based on the fact that if $a/b$, $c/d$ are neighbours in $F_n$, then
$ad - bc = \pm 1$. This in turn can be seen by noting that one can go from $F_n$
to $F_{n+1}$ by inserting between $a/b$ and $c/d$ all mediants $(a + c)/(b + d)$ with
$b + d = n + 1$.

In this theorem $\alpha$ need not be irrational, so we obtain some information
about rational approximations to rational numbers with large denominators.
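
Both the neighbour property and the bound (1.4.3) are easy to test numerically for small orders. The sketch below builds the Farey series by brute force (not by inserting mediants); the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from math import gcd


def farey(n: int):
    """Farey series of order n: reduced fractions a/b in [0, 1] with b <= n, ascending."""
    return sorted(
        Fraction(a, b)
        for b in range(1, n + 1)
        for a in range(b + 1)
        if gcd(a, b) == 1
    )


def neighbours_unimodular(n: int) -> bool:
    """For consecutive a/b < c/d in the series, check that b*c - a*d == 1."""
    seq = farey(n)
    return all(
        c.numerator * a.denominator - a.numerator * c.denominator == 1
        for a, c in zip(seq, seq[1:])
    )


def best_farey(alpha: Fraction, n: int) -> Fraction:
    """Element a/b of the series minimising b*|alpha - a/b|."""
    return min(farey(n), key=lambda f: abs(alpha - f) * f.denominator)


if __name__ == "__main__":
    print([str(f) for f in farey(5)])
    print(neighbours_unimodular(8))
    alpha = Fraction(31, 97)
    f = best_farey(alpha, 7)
    # Bound of (1.4.3): |alpha - a/b| <= 1 / (b*(n+1))
    print(f, abs(alpha - f) <= Fraction(1, f.denominator * 8))
```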

If $\alpha$ is irrational, this theorem shows that the inequality


$$\left|\alpha - \frac{a}{b}\right| < \frac{1}{b^2} \qquad (1.4.4)$$


has infinitely many solutions $a/b$. If $a/b$ is a best approximation, then (1.4.4)
follows from (1.4.3) with $n = b$. An efficient way of finding best approximations
is furnished by continued fractions. This tool also allows us to show that for
irrational $\alpha$ the following stronger inequality has infinitely many solutions:


$$\left|\alpha - \frac{a}{b}\right| < \frac{1}{\sqrt{5}\, b^2}. \qquad (1.4.5)$$


1.4.3 Continued Fractions 


(cf. [Khi78], [Dav52], [HaWr]). For an arbitrary real number $\alpha$, we define a
sequence of integers $a_i$ and real numbers $\alpha_i$ by the following rules: $a_0 = [\alpha]$
(the integral part), $\alpha_0 = \alpha$, $\alpha_{i+1} = 1/(\alpha_i - a_i)$, $a_{i+1} = [\alpha_{i+1}]$ ($i \ge 0$). We
obtain a continued fraction

$$\alpha = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_m + \cfrac{1}{\alpha_{m+1}}}}}$$

which can be written in a more compact notation as
$$\alpha = [a_0; a_1, a_2, \dots, a_m, \alpha_{m+1}]. \qquad (1.4.6)$$
Deleting $\alpha_{m+1}$ in (1.4.6), we get the finite continued fraction
$$C_m = [a_0; a_1, \dots, a_m]$$


called the $m$-th convergent of $\alpha$. The numerators and denominators of the
successive convergents $C_m = A_m/B_m$ can be calculated recursively starting
from $A_{-2} = B_{-1} = 0$, $A_{-1} = B_{-2} = 1$ with the help of the following relations:


$$A_{k+1} = a_{k+1} A_k + A_{k-1},$$
$$B_{k+1} = a_{k+1} B_k + B_{k-1} \quad (k = -1, 0, 1, \dots). \qquad (1.4.7)$$
If $\alpha$ is irrational, then $a_m \ne 0$ for all natural $m$. Convergents of even order
increase; those of odd order decrease, and both sequences converge to $\alpha$. The
limit is denoted as the (infinite) continued fraction


$$\alpha = [a_0; a_1, a_2, \dots, a_n, \dots].$$
This all follows easily from (1.4.7): first we see that
$$B_k A_{k-1} - A_k B_{k-1} = (-1)^k, \quad k \ge 1,$$
$$B_k A_{k-2} - A_k B_{k-2} = (-1)^{k-1} a_k, \quad k \ge 0, \qquad (1.4.8)$$
and then


$$\frac{A_{k-1}}{B_{k-1}} - \frac{A_k}{B_k} = \frac{(-1)^k}{B_k B_{k-1}},$$
$$\frac{A_{k-2}}{B_{k-2}} - \frac{A_k}{B_k} = \frac{(-1)^{k-1} a_k}{B_k B_{k-2}}. \qquad (1.4.9)$$


From (1.4.9) one also deduces that every best approximation to $\alpha$ is equal to
a convergent $A_m/B_m$, because


$$\frac{1}{B_m (B_m + B_{m+1})} < \left|\alpha - \frac{A_m}{B_m}\right| < \frac{1}{B_m B_{m+1}}. \qquad (1.4.10)$$
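
The rules defining the partial quotients and the recurrence (1.4.7) translate almost literally into code. The sketch below works with exact rational input so that the expansion terminates; the function names are assumptions made for illustration.

```python
from fractions import Fraction


def continued_fraction(alpha: Fraction):
    """Partial quotients [a_0; a_1, ..., a_m] of a rational number alpha."""
    digits = []
    while True:
        a = alpha.numerator // alpha.denominator  # integral part [alpha_i]
        digits.append(a)
        rest = alpha - a
        if rest == 0:
            return digits
        alpha = 1 / rest  # alpha_{i+1} = 1 / (alpha_i - a_i)


def convergents(digits):
    """Pairs (A_k, B_k), k = 0, 1, ..., computed with the recurrence (1.4.7)."""
    a_prev, a_cur = 0, 1  # A_{-2}, A_{-1}
    b_prev, b_cur = 1, 0  # B_{-2}, B_{-1}
    result = []
    for a in digits:
        a_prev, a_cur = a_cur, a * a_cur + a_prev
        b_prev, b_cur = b_cur, a * b_cur + b_prev
        result.append((a_cur, b_cur))
    return result


if __name__ == "__main__":
    digits = continued_fraction(Fraction(415, 93))
    print(digits)
    print(convergents(digits))
    # Truncated expansion of sqrt(2) = [1; 2, 2, 2, ...]
    print(convergents([1, 2, 2, 2, 2, 2]))
```

For an irrational number one would feed `convergents` with partial quotients obtained in some other exact way, since floating-point expansion loses accuracy after a few steps.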
1.4.4 SL_2-Equivalence


The numbers $\alpha_m$ defined by (1.4.6) are related to $\alpha$ via fractional linear
transformations

$$\alpha = \frac{A_{m-1} \alpha_m + A_{m-2}}{B_{m-1} \alpha_m + B_{m-2}}. \qquad (1.4.11)$$


Moreover, the determinants of these transformations are $(-1)^m$ (see (1.4.8)).
In general, two numbers related by a fractional linear transformation of
determinant 1 are called $\mathrm{SL}_2(\mathbb{Z})$-equivalent. Hence $\alpha$ and $(-1)^m \alpha_m$ are equivalent
in this sense. Conversely, $\alpha$ and $\beta$ are equivalent iff $\alpha_m = \beta_n$ for appropriate
$m$ and $n$ (cf. e.g. [Khi78]). In particular, all rational numbers are equivalent
to one another.


1.4.5 Periodic Continued Fractions and Pell’s Equation 


Consider an infinite continued fraction which becomes periodic after a certain
place $k_0$, with a period of length $k$:


$$\alpha = [a_0; a_1, \dots, a_{k_0 - 1}, \overline{a_{k_0}, \dots, a_{k_0 + k - 1}}]. \qquad (1.4.12)$$
Then from (1.4.11) it follows that $\alpha$ is a quadratic irrational number.
Example. We have
$$\sqrt{3} = [1; 1, 2, 1, 2, 1, 2, \dots],$$


since, denoting by $x$ the r.h.s. continued fraction, we have

$$x = 1 + \cfrac{1}{1 + \cfrac{1}{1 + x}},$$

that is,
$$2x + x^2 = 3 + 2x$$


and, finally, $x = \sqrt{3}$.
The following algorithm efficiently calculates $a_i$ for a quadratic irrationality
$\alpha$. Let $N$ be square-free,


$$\alpha = (P_0 + \sqrt{N})/Q_0,$$
$N - P_0^2$ being divisible by $Q_0$. Find successively
$$P_{i+1} = a_i Q_i - P_i, \quad Q_{i+1} = (N - P_{i+1}^2)/Q_i, \quad i = 0, 1, 2, \dots$$
Then the $P_i$ and $Q_i$ are all integers; $Q_i$ divides $N - P_i^2$, and
$$\alpha_{i+1} = \frac{P_{i+1} + \sqrt{N}}{Q_{i+1}}.$$
In general, $P_i$ and $Q_i$ do not grow as rapidly as the numerators and denominators
of the successive convergents. For example, if $|P_0| < \sqrt{N}$, $0 < Q_0 < \sqrt{N}$,
we have for all $i \ge 1$:


$$0 < P_i < \sqrt{N}, \quad 0 < Q_i < 2\sqrt{N},$$
$$A_i^2 - N B_i^2 = (-1)^{i+1} Q_{i+1}$$
(cf. [Ries85], [Knu81]). At the $i$-th stage, the calculations consist of four steps.


1) $P_{i+1} = [\sqrt{N}] - R_i$ ($R_0 = 0$),

2) $Q_{i+1} = (N - P_{i+1}^2)/Q_i$ ($Q_{-1} = (N - P_0^2)/Q_0$),

3) $a_{i+1} = [(P_{i+1} + [\sqrt{N}])/Q_{i+1}]$,

4) $R_{i+1}$ = the residue of $P_{i+1} + [\sqrt{N}]$ modulo $Q_{i+1}$.


This algorithm can be used to calculate efficiently the smallest solution to
Pell's equation. In fact, if $a^2 - Nb^2 = 1$, then we have $a^2 \ge 1 + N$, $b^2 \ge 1$, so
that

$$\left|\frac{a}{b} - \sqrt{N}\right| < \frac{1}{2b^2\sqrt{N}}.$$
Hence $a/b$ is one of the convergents of $\sqrt{N}$.


Example. The smallest solution to $x^2 - 43y^2 = 1$ is $x = 3482$, $y = 531$. Its
calculation by the method described above is recorded in Table 1.8.


Table 1.8.
i                 | -2 | -1 |  0 | 1 |  2 |  3 |  4 |   5 |   6 |    7 |    8 |    9
a_i               |    |    |  6 | 1 |  1 |  3 |  1 |   5 |   1 |    3 |    1 |    1
P_i               |    |    |  0 | 6 |  1 |  5 |  4 |   5 |   5 |    4 |    5 |    1
Q_i               |    |    |  1 | 7 |  6 |  3 |  9 |   2 |   9 |    3 |    6 |    7
R_i               |    |    |  0 | 5 |  1 |  2 |  1 |   1 |   2 |    1 |    5 |    0
A_i               |  0 |  1 |  6 | 7 | 13 | 46 | 59 | 341 | 400 | 1541 | 1941 | 3482
B_i               |  1 |  0 |  1 | 1 |  2 |  7 |  9 |  52 |  61 |  235 |  296 |  531
A_i^2 - 43 B_i^2  |    |    | -7 | 6 | -3 |  9 | -2 |   9 |  -3 |    6 |   -7 |    1

1.5 Diophantine Approximation and the Irrationality of ζ(3)


1.5.1 Ideas in the Proof that ζ(3) is Irrational


One of the amazing mathematical inventions of recent time, showing the vast
undiscovered power of elementary methods in number theory, was the proof
of the irrationality of $\zeta(3) = \sum_{n=1}^{\infty} n^{-3}$ found by the French mathematician
Apéry. This proof was first presented in June 1978 at the conference Journées
Arithmétiques de Marseille-Luminy.

We follow here an informal exposition of the proof due to van der Poorten
(cf. [vdP79]), who notes the original mistrust of the proof among other
mathematicians; it was at first taken as a collection of mysterious statements.


1) For all integers $a_1, a_2, \dots$


$$\sum_{k=1}^{\infty} \frac{a_1 a_2 \cdots a_{k-1}}{(x + a_1)(x + a_2) \cdots (x + a_k)} = \frac{1}{x}. \qquad (1.5.1)$$


2)
$$\zeta(3) = \frac{5}{2} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}. \qquad (1.5.2)$$


3) Consider the recurrence relation: for $n \ge 2$


$$n^3 u_n - (34n^3 - 51n^2 + 27n - 5) u_{n-1} + (n-1)^3 u_{n-2} = 0, \qquad (1.5.3)$$


and let $b_n$ be a sequence defined by the initial conditions $b_0 = 1$, $b_1 = 5$
and the relation (1.5.3). Let $a_n$ be the sequence defined by (1.5.3) and
the initial conditions $a_0 = 0$, $a_1 = 6$. Then the $b_n$ are integers and the
denominators of the rational numbers $a_n$ divide $2[1, 2, \dots, n]^3$, where
$[1, 2, \dots, n]$ denotes the least common multiple of the numbers $1, 2, \dots, n$.


4) The sequence $a_n/b_n$ converges to $\zeta(3)$ rapidly enough for one to establish
irrationality of $\zeta(3)$. Moreover, for $\varepsilon > 0$ and for all integers $p, q > 0$ with
$q$ sufficiently large the inequality holds


$$\left|\zeta(3) - \frac{p}{q}\right| > \frac{1}{q^{\theta + \varepsilon}}, \quad \theta = 13.41782\dots \qquad (1.5.4)$$
One has the following continued fraction expansion:
$$\zeta(3) = \cfrac{6}{p(0) - \cfrac{1^6}{p(1) - \cfrac{2^6}{p(2) - \cdots - \cfrac{n^6}{p(n) - \cdots}}}} \qquad (1.5.5)$$
where $p(n) = 34n^3 + 51n^2 + 27n + 5$, that is,
$$\zeta(3) = \cfrac{6}{5 - \cfrac{1}{117 - \cfrac{64}{535 - \cfrac{729}{1463 - \cfrac{4096}{3105 - \cdots - \cfrac{n^6}{34n^3 + 51n^2 + 27n + 5 - \cdots}}}}}}.$$
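
Statements 3) and 4) can be explored numerically with exact rational arithmetic. The sketch below iterates the recurrence (1.5.3) from the stated initial conditions; the function name is an assumption, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def apery_sequences(n_max: int):
    """Lists a[0..n_max], b[0..n_max] defined by the recurrence (1.5.3)."""
    a = [Fraction(0), Fraction(6)]
    b = [Fraction(1), Fraction(5)]
    for n in range(2, n_max + 1):
        poly = 34 * n**3 - 51 * n**2 + 27 * n - 5
        a.append((poly * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
        b.append((poly * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)
    return a, b


if __name__ == "__main__":
    a, b = apery_sequences(10)
    for n in range(1, 11):
        bound = 2 * lcm(*range(1, n + 1)) ** 3
        # b_n is an integer; the denominator of a_n divides 2*[1..n]^3
        print(n, b[n], a[n].denominator, bound % a[n].denominator == 0,
              float(a[n] / b[n]))
```

The printed quotients a_n/b_n approach 1.2020569..., the decimal value of zeta(3), gaining roughly three decimal digits per step.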


1.5.2 The Measure of Irrationality of a Number 


In §4 of Chapter 1 we noted a link between the property of a number $\beta$ being
irrational and the existence of infinitely many good rational approximations
$p/q$ to $\beta$, i.e. such that the inequality holds


$$\left|\beta - \frac{p}{q}\right| < \frac{1}{q^2}.$$


Analogously one could state the following criterion for the irrationality of a
number: if there exist $\delta > 0$ and a sequence $\{p_n/q_n\}$ of rational numbers
$p_n/q_n \ne \beta$ such that


$$\left|\beta - \frac{p_n}{q_n}\right| < \frac{1}{q_n^{1+\delta}} \quad (n = 1, 2, \dots), \qquad (1.5.6)$$


then $\beta$ is an irrational number. The use of this criterion gives an interesting
measure of irrationality: if $|\beta - p_n/q_n| < q_n^{-(1+\delta)}$ and the $q_n$ steadily increase in such a
way that $q_n < q_{n-1}^{1+\kappa}$ for sufficiently large $n$ and $\kappa > 0$, then for any fixed $\varepsilon > 0$
and for all sufficiently large $p, q > 0$ the following inequality holds:


$$\left|\beta - \frac{p}{q}\right| > \frac{1}{q^{(1+\delta)/(\delta - \kappa) + \varepsilon}}. \qquad (1.5.7)$$


In the interesting case when $q_n$ increases geometrically, i.e. $q_n \approx C\alpha^n$, $C, \alpha > 0$, one
could take for $\kappa$ an arbitrarily small positive number, and the exponent in
(1.5.7) becomes $1 + (1/\delta)$, which is called the irrationality degree of $\beta$.

Surprisingly, the method of Apéry turned out also to be applicable to the
number


$$\zeta(2) = \sum_{n=1}^{\infty} n^{-2} = \frac{\pi^2}{6},$$
whose transcendence is well known. However Apéry's proof implies the
inequality


$$\left|\pi^2 - \frac{p}{q}\right| > \frac{1}{q^{\theta' + \varepsilon}}, \quad \theta' = 11.85078\dots, \qquad (1.5.8)$$


for all $\varepsilon > 0$ and $q$ sufficiently large. One also knows that the irrationality
degrees of $\zeta(3)$ and $\pi^2$ are not greater than $\theta$ and $\theta'$ respectively.
1.5.3 The Thue-Siegel-Roth Theorem, Transcendental Numbers,
and Diophantine Equations


(cf. [Roth55], [Dav58], [Spr82], [Fel82], [Shid87], [Maz86]). This famous
theorem states that if $\beta$ is an irrational algebraic number, i.e. a root of a polynomial
$f(X) = a_n X^n + a_{n-1} X^{n-1} + \dots + a_0$ ($a_i \in \mathbb{Z}$), then for an arbitrary fixed
$\varepsilon > 0$ and all sufficiently large $q$ the following inequality holds:


$$\left|\beta - \frac{p}{q}\right| > \frac{1}{q^{2+\varepsilon}}. \qquad (1.5.9)$$
In other words, if we take arbitrary positive constants $C$ and $\varepsilon$, then there
exist only a finite number of approximations $x/y$ of $\beta$ satisfying the inequality


$$\left|\beta - \frac{x}{y}\right| < \frac{C}{|y|^{2+\varepsilon}}. \qquad (1.5.10)$$


In particular, if the inequality (1.5.6) holds for a sequence $(p_n/q_n)$ with a fixed
$\delta > 1$, then the number $\beta$ must be transcendental (i.e. not algebraic). However
it turns out that this condition defines only a subset of the transcendental
numbers of measure zero.

Note that the theorem of Thue-Siegel-Roth has very important applications
in the theory of Diophantine equations, which can be explained by the
example of the equation


$$X^3 - 5Y^3 = m \quad (m \ne 0) \qquad (1.5.11)$$


for a fixed integer $m$. This equation resembles Pell's equation, but its degree
is greater than 2. If $(x, y)$ is a solution of (1.5.11) then the following inequality
holds:


$$\left|\frac{x}{y} - \sqrt[3]{5}\right| < \frac{C}{|y|^3}, \qquad (1.5.12)$$
where the constant $C$ depends only on $m$.


However if we take $\varepsilon > 0$ such that $2 + \varepsilon < 3$ then Roth's theorem implies that
there are only finitely many solutions for the inequality (1.5.12) and hence for
the equation (1.5.11).

Using algebraic geometric methods, but resting on essentially the same 
idea, Siegel established the following result: 


Theorem 1.6 (Siegel C.-L. (1929)). Let f(X,Y) be an irreducible poly- 
nomial with integer coefficients. Then the equation 


f(X,Y) =0 (1.5.13) 
has only finitely many integral solutions excluding the two special cases: 


a) The curve $f(X, Y) = 0$ admits a rational parameterization: substituting into
(1.5.13) non-zero rational fractions $X = p(t)/q(t)$, $Y = r(t)/s(t) \in \mathbb{Q}(t)$,
this equation becomes an identity of rational functions of $t$.
b) The projective closure of the curve (1.5.13) has not more than two points
at infinity.


In particular, the Thue equation $f(x, y) = m$ where $f(x, y) \in \mathbb{Z}[x, y]$ is
an irreducible form of degree $n \ge 3$, has only a finite number of integral
solutions. A. O. Gelfond (cf. [Ge83]) has shown that an effective bound for
solutions of the Thue equation can be obtained if one has a good lower bound
for the absolute value of linear forms in logarithms of algebraic numbers $\alpha_1, \dots, \alpha_n$
(with integer coefficients). Such estimates were obtained by Baker [Ba71], making
it possible to solve a number of important arithmetic problems. Besides bounds
for solutions of Diophantine equations ([Spr82],
[Step84], [Schm79], [Bak86], [La60], [La62]), these problems also include effective bounds for the class
numbers of algebraic number fields and the numbers of equivalence classes of
quadratic forms ([Ba71], [St67], [St69]). An effective upper bound (see [Bak86],
[Spr82], [ShT86])

y? <a" < expexpexp exp 10? 


was obtained by Baker's method for solutions of the Catalan equation
$$x^p - y^q = 1,$$


which provides an example of an exponential Diophantine equation systematically
studied in [ShT86]. Catalan asked in 1843 whether 8 and 9 are the only
consecutive perfect powers. A recent solution of this problem by P. Mihailescu
(who answered the question affirmatively) has become one of the main
arithmetical highlights of the past few years, cf. [Mih03], [Bi02].


1.5.4 Proofs of the Identities (1.5.1) and (1.5.2) 


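First of all the equality


$$\sum_{k=1}^{K} \frac{a_1 a_2 \cdots a_{k-1}}{(x + a_1) \cdots (x + a_k)} = \frac{1}{x} - \frac{a_1 a_2 \cdots a_K}{x (x + a_1) \cdots (x + a_K)} \qquad (1.5.14)$$
is easy to check. We may write the right hand side in the form

$$A_0 - A_K, \quad A_k = \frac{a_1 a_2 \cdots a_k}{x (x + a_1) \cdots (x + a_k)},$$

and note that each term in the left hand side is equal to $A_{k-1} - A_k$. The
identity (1.5.1) follows immediately from (1.5.14).


Now substituting $x = n^2$ and $a_k = -k^2$ and taking $k \le K \le n - 1$ we
obtain
$$\sum_{k=1}^{n-1} \frac{(-1)^{k-1} ((k-1)!)^2}{(n^2 - 1^2) \cdots (n^2 - k^2)} = \frac{1}{n^2} - \frac{(-1)^{n-1} ((n-1)!)^2}{n^2 (n^2 - 1^2) \cdots (n^2 - (n-1)^2)} = \frac{1}{n^2} - \frac{2 (-1)^{n-1}}{n^2 \binom{2n}{n}}.$$
Writing $\varepsilon_{n,k} = \dfrac{1}{2} \dfrac{(k!)^2 (n-k)!}{k^3 (n+k)!}$ we have

$$(-1)^k n (\varepsilon_{n,k} - \varepsilon_{n-1,k}) = \frac{(-1)^{k-1} ((k-1)!)^2}{(n^2 - 1^2) \cdots (n^2 - k^2)},$$


from which follows the identity


$$\sum_{n=1}^{N} \sum_{k=1}^{n-1} (-1)^k (\varepsilon_{n,k} - \varepsilon_{n-1,k}) = \sum_{n=1}^{N} \frac{1}{n^3} - 2 \sum_{n=1}^{N} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}$$

$$= \sum_{k=1}^{N} (-1)^k (\varepsilon_{N,k} - \varepsilon_{k,k}) = \sum_{k=1}^{N} \frac{(-1)^k}{2k^3 \binom{N+k}{k} \binom{N}{k}} + \frac{1}{2} \sum_{k=1}^{N} \frac{(-1)^{k-1}}{k^3 \binom{2k}{k}}, \qquad (1.5.15)$$


in which the first sum on the last line tends to zero as $N \to \infty$; this gives (1.5.2).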


1.5.5 The Recurrent Sequences $a_n$ and $b_n$

Write the recurrence relation (1.5.3) satisfied by $a_n$ and $b_n$:
$$n^3 a_n - P(n-1) a_{n-1} + (n-1)^3 a_{n-2} = 0,$$
$$n^3 b_n - P(n-1) b_{n-1} + (n-1)^3 b_{n-2} = 0,$$


where $P(n-1) = 34n^3 - 51n^2 + 27n - 5$. If we multiply the first
equality by $b_{n-1}$ and the second by $a_{n-1}$, and then subtract the second from the
first, we get


$$n^3 (a_n b_{n-1} - a_{n-1} b_n) = (n-1)^3 (a_{n-1} b_{n-2} - a_{n-2} b_{n-1}).$$


Recall that by the initial conditions we have $a_1 b_0 - a_0 b_1 = 6 \cdot 1 - 0 \cdot 5 = 6$,
which implies


$$a_n b_{n-1} - a_{n-1} b_n = \frac{6}{n^3}. \qquad (1.5.16)$$
This easily leads to the relation
$$\left|\zeta(3) - \frac{a_n}{b_n}\right| = \sum_{k=n+1}^{\infty} \frac{6}{k^3 b_k b_{k-1}} = O(b_n^{-2}). \qquad (1.5.17)$$

This is proved by induction starting from the equality $\zeta(3) - \frac{a_0}{b_0} = \zeta(3)$.


The absolute values of the numbers $b_n$ can be easily estimated using the
relation
$$b_n - (34 - 51n^{-1} + 27n^{-2} - 5n^{-3}) b_{n-1} + (1 - 3n^{-1} + 3n^{-2} - n^{-3}) b_{n-2} = 0.$$


Using the fact that the "linearized characteristic polynomial" of this recurrence
relation is $x^2 - 34x + 1$ and has roots $17 \pm 12\sqrt{2} = (1 \pm \sqrt{2})^4$, we obtain the
estimate


$$b_n = O(\alpha^n), \quad \alpha = (1 + \sqrt{2})^4.$$


Assume for a moment that the statement in 3) on integrality of the numbers
$b_n$ and on the denominators of $a_n$ dividing $2[1, 2, \dots, n]^3$ is already
proved. Then it is easy to complete the proof as follows. Let


$$p_n = 2[1, 2, \dots, n]^3 a_n, \quad q_n = 2[1, 2, \dots, n]^3 b_n,$$


where $p_n, q_n \in \mathbb{Z}$. The value of $[1, 2, \dots, n]$ can be estimated using for example
a rough form of the prime number theorem: $\pi(x) = \sum_{p \le x} 1 \approx x/\log x$. Then


$$[1, 2, \dots, n] = \prod_{p \le n} p^{[\log n/\log p]} \le \prod_{p \le n} n \approx n^{n/\log n} = e^n.$$


Hence $q_n = O(\alpha^n e^{3n})$ and


$$\left|\zeta(3) - \frac{p_n}{q_n}\right| = O(b_n^{-2}) = O(\alpha^{-2n}) = O(q_n^{-(1+\delta)})$$


with the constant $\delta = (\log \alpha - 3)/(\log \alpha + 3) = 0.080529\dots > 0$. According to
the irrationality criterion in §1.5.2, we obtain the statement (1.5.4), in which
the irrationality degree is not greater than $1 + (1/\delta) = \theta$.

The statement on the denominators of the numbers $a_n$ and $b_n$ is one of
the most difficult points of the proof. Apéry proved this fact by explicitly
constructing the sequences $a_n$ and $b_n$:


$$b_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}^2, \quad a_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}^2 c_{n,k},$$


where


$$c_{n,k} = \sum_{m=1}^{n} \frac{1}{m^3} + \sum_{m=1}^{k} \frac{(-1)^{m-1}}{2m^3 \binom{n}{m} \binom{n+m}{m}}. \qquad (1.5.18)$$
It follows from these formulae that the numbers $b_n$ are integral. The bound
on the denominators of $a_n$ is given by the fact that all of the numbers


$$2[1, 2, \dots, n]^3 c_{n,k} \binom{n+k}{k}$$


are integers. The proof of this uses an estimate for the maximal power with
which a prime $p$ can arise in the denominator of each term in the sum (1.5.18)
defining $c_{n,k}$.
Further irrationality properties of zeta values at odd positive integers were
studied recently in [Riv01], [Zu95], [BaRi01], see also [Zu]. In particular,
W.Zudilin proved that at least one of $\zeta(5), \zeta(7), \zeta(9), \zeta(11)$ is irrational.

There are interesting attempts to prove the irrationality of Euler's constant
and to understand its arithmetic nature, cf. [Son04]. In [BelBr03] Euler's
constant $\gamma$ is interpreted as an exponential period. The ring $\mathcal{P}$ of periods
is generated by the numbers of the form $\int_\gamma \omega$ where $X$ is a smooth algebraic
variety of dimension $d$ defined over $\mathbb{Q}$, $D \subset X$ is a divisor with normal
crossings, $\omega \in \Omega^d(X)$ is an algebraic differential form of degree $d$ on
$X$, $\gamma \in H_d(X(\mathbb{C}), D(\mathbb{C}); \mathbb{Q})$, cf. Chapter 5. This ring was introduced by M.
Kontsevich and D. B. Zagier in [KoZa01] (see also [Del79]).


1.5.6 Transcendental Numbers and the Seventh Hilbert Problem 


It is useful to compare the given elementary proof with the highly developed
theory of transcendental numbers, i.e. numbers which are not roots of
polynomials with rational coefficients. The existence of such numbers was
first established by Liouville in 1844; then Hermite proved the transcendence
of $e$ (in 1873), and Lindemann in 1882 proved the transcendence of $\pi$ ([Ba75],
[Bak86], [Shid87]). In the framework of a general theory A. O. Gelfond (see
[Ge73]) and Th. Schneider (cf. [Sch57]) obtained a solution to the seventh
Hilbert problem (Hilbert D. (cf. [Hil1900]), [Fel82]): to prove that "the power
$\alpha^\beta$ of an algebraic base $\alpha$ to an irrational algebraic exponent $\beta$ (e.g. $2^{\sqrt{2}}$ or
$e^\pi = i^{-2i}$) is always transcendental, or at least irrational"; "... we found it very
probable that such a function as $e^{i\pi z}$, which evidently takes algebraic values
for all rational values of the argument $z$, will take, on the other hand, for
algebraic irrational values of $z$, only transcendental values".


1.5.7 Work of Yu.V. Nesterenko on $e^\pi$, [Nes99]


One of the most impressive achievements of the last decade in the theory of
transcendental numbers was the work of Yu.V. Nesterenko on the algebraic
independence of $\pi$ and $e^\pi$, see [Nes99] and [Nes02]. This result is based on the
study of the transcendence degree of a field generated by numbers connected
with the modular function $j(\tau)$. In [Nes99] the algebraic independence of $\pi$, $e^\pi$
and $\Gamma(1/4)$ is also established by this powerful method. For proving this result,
the problem is reduced to estimating the measure of algebraic independence
for the numbers $\pi$ and $\Gamma(1/4)$.


2 


Some Applications of Elementary Number 
Theory 


2.1 Factorization and Public Key Cryptosystems 


2.1.1 Factorization is Time-Consuming 


In order to multiply two primes $p < q$ given their binary expansions, it suffices
to perform $C(\log q)^2$ bit-operations (see section 1.1.1). Suppose now that we
are given $n = pq$ and are asked to find $p$ and $q$. If $p \sim q \sim \sqrt{n}$ then the naive
repeated trial of all $d \le \sqrt{n}$ would require more than


$$C\sqrt{n} = C \exp\left(\tfrac{1}{2} \log n\right)$$


divisions with remainder. This exponential growth of the running time makes
the factorization of even rather small numbers unfeasible, at least unless one
invents more efficient algorithms. For example, consider the factorization


$$(10^{71} - 1)/9 = 241573142393627673576957439049 \times 45994811347886846310221728895223034301839. \qquad (2.1.1)$$


With some patience, one can multiply the two numbers on the right hand side 
in an hour or two on a sheet of paper. However, the factorization of the result 
by the trial-and-error method would take about 10!° years of running time (if 
one division requires 10~° sec: cf. [Sim79], [Pet85], [Wun85], [Ya02]). 

In real life, the factorization (2.1.1) was first found in 1984 with the assis- 
tance of a CRAY supercomputer and fairly advanced factorization methods, 
which made this task feasible if not inexpensive. 
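
The asymmetry between multiplying and factoring in (2.1.1) is easy to observe. The sketch below checks the product exactly and runs naive trial division up to a small bound; the function name and the bound 10^5 are arbitrary choices made for illustration.

```python
P = 241573142393627673576957439049
Q = 45994811347886846310221728895223034301839
N = (10**71 - 1) // 9  # the number 11...1 with 71 digits


def trial_division(n: int, limit: int):
    """Return the smallest divisor d of n with 2 <= d <= limit, or None."""
    d = 2
    while d <= limit and d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None


if __name__ == "__main__":
    print(P * Q == N)                 # multiplication: instantaneous
    print(trial_division(N, 10**5))   # trial division: nothing found this low
```

Since the smaller factor has 30 decimal digits, the loop would need on the order of 10^29 iterations to succeed, which is the point made in the text.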


2.1.2 One-Way Functions and Public Key Encryption 


We may consider the binary expansion of $n = pq$ as a message which can
be encoded in many other ways, e.g., by giving expansions of $p$ and $q$. The
rules explaining how to pass from one form to another can, from the
information-theoretical viewpoint, be called enciphering (encryption) and deciphering.
Experimentally, one knows that some functions are easy to compute but
difficult to invert (one-way, or trap-door functions). It is then natural to try
to use these functions in cryptography. We recall that cryptography studies
problems of information handling concerned with keeping and breaking
secrecy of messages. One-way functions are used in the so called public key
encryption schemes, which were suggested in the seventies and revolutionized
this domain.

Before explaining the design of one such scheme, we must stress however
that there are no theoretical lower bounds on computational complexity
justifying our experimental observation that the complexity of factorization far exceeds
that of multiplication. In principle, we cannot exclude the possibility that a
very efficient algorithm for factorization (or for inverting any given trap-door
function) might eventually be found. This is one of the basic problems of
computational complexity theory (cf. e.g. [GJ79], [DH76], [CoLe84], [ARS78],
[Ya02]). If, however, we assume this experimental fact, we can use it in order
to generate new encryption schemes with remarkable properties.

We shall now describe the first “public key cryptosystem” suggested by 
L.Adleman, R.Rivest, and A.Shamir in 1978, cf. [ARS78]. 


2.1.3 A Public Key Cryptosystem 


Imagine a system of users $U_1, U_2, U_3, \dots$ From time to time any pair of users
may need to exchange messages that should remain secret to other users or
outsiders.

In a classical cryptosystem, they should first share keys and keep them
secret. A public key system avoids this last restriction: secret pairwise
communication becomes possible using only information open to everybody. Such
a system can be devised as follows.


a) Every user $U_i$ chooses two large primes $p_i$ and $q_i$, and two residue classes
$e_i, d_i \bmod n_i$, where $n_i = p_i q_i$, such that $e_i d_i \equiv 1 \bmod \varphi(n_i)$ where
$\varphi(n_i) = (p_i - 1)(q_i - 1)$ denotes the Euler function (cf. 1.1.4).

b) The numbers $(e_i, n_i)$ are made public for all users.


We argue that it is unfeasible to calculate $d_i$ knowing only $(e_i, n_i)$, so that
$d_i$ can be considered as a secret known to $U_i$ alone. In fact, we shall show
that an efficient algorithm for calculating $d_i$ would also find efficiently the
prime factorization of $n_i$, which we assumed to be difficult. Suppose that
we know $d_i$. We then know that $\varphi(n_i)$ divides $e_i d_i - 1$. If we knew $\varphi(n_i)$
itself then we could easily find $p_i$ and $q_i$, since $p_i + q_i = n_i + 1 - \varphi(n_i)$
and $p_i - q_i = \sqrt{(p_i + q_i)^2 - 4n_i}$. One can show that even knowing only a
multiple of $\varphi(n_i)$ suffices (cf. [Mil76], [Wag86]) to find $p_i$ and $q_i$.

c) Suppose that a user $U_i$ wishes to transmit to $U_j$ a coded message which is
a sequence of bits. He first breaks this sequence up into blocks of length
$[\log n_j]$, then considers each block as a residue class $m \bmod n_j$ and finally
encodes it as the residue class $m^{e_j} \bmod n_j$. Thus, $(n_j, e_j)$ serves as the
encryption key of the $j$-th user (recall that it is common knowledge).

d) Having received the encoded message, $U_j$ decodes any block $b \bmod n_j$
by computing $b^{d_j} \bmod n_j$ (recall that he knows the deciphering key $d_j$).
This is easily checked with the help of Fermat's little theorem (1.1.4).
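
Steps a) to d), together with the remark that knowledge of φ(n) reveals the factors, can be illustrated with toy parameters. This is a sketch only: the tiny primes 61 and 53, the exponent 17 and all function names are assumptions chosen for readability and offer no security; `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd, isqrt


def make_keys(p: int, q: int, e: int):
    """Return the public key (n, e) and the secret exponent d."""
    n, phi = p * q, (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be invertible modulo phi(n)")
    d = pow(e, -1, phi)  # e*d = 1 mod phi(n)
    return (n, e), d


def encrypt(m: int, public) -> int:
    n, e = public
    return pow(m, e, n)  # step c): m -> m^e mod n


def decrypt(c: int, public, d: int) -> int:
    n, _ = public
    return pow(c, d, n)  # step d): b -> b^d mod n


def factor_from_phi(n: int, phi: int):
    """Recover p <= q from n = p*q and phi(n), as explained in the text."""
    s = n + 1 - phi              # p + q
    t = isqrt(s * s - 4 * n)     # |p - q|
    return (s - t) // 2, (s + t) // 2


if __name__ == "__main__":
    public, d = make_keys(61, 53, 17)
    c = encrypt(65, public)
    print(public, d, c, decrypt(c, public, d))
    print(factor_from_phi(3233, 3120))
```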


Clearly, the details of such a scheme can be varied ad infinitum. For
example, one can devise an authentication procedure ("electronic signature")
which uses a form of a secret message from $U_i$ to $U_j$ allowing $U_j$ to convince
a third party (a "judge") that the author of the message is $U_i$, so that it is not
faked by $U_j$ himself. This can be crucial for certain financial transactions.

Denote by $E_j$ the encoding map for messages addressed to $U_j$ and by $D_j$
his deciphering map. Then $E_j$ is public domain while $D_j$ is $U_j$'s property. For
an arbitrary plain message $M$ we have $D_j(E_j(M)) = M$ and $E_j(D_j(M)) = M$.
The user $U_i$ sending his message $M$ to $U_j$ uses as his signature $S = D_i(M)$
and transmits to $U_j$ its encoded version $E_j(S)$. In his turn, $U_j$ first computes
$S = D_j(E_j(S))$ and then $M = E_i(S)$ using the public key $E_i$. The addressee
can convince a judge that $M$ comes from $U_i$ because only by applying $E_i$ can
one transform $S$ into a given sensible message $M$. On the other hand, the
addressee cannot fake $S$ since he does not know $D_i$.
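
The signature exchange can be sketched with two toy key pairs. Everything concrete here is an assumption made for illustration: the primes, the exponents, the dictionary layout, and the requirement that the sender's modulus be smaller than the receiver's so that S is a valid residue for the receiver's map.

```python
def keypair(p: int, q: int, e: int):
    """Toy key pair; 'd' is secret, 'n' and 'e' are public."""
    n, phi = p * q, (p - 1) * (q - 1)
    return {"n": n, "e": e, "d": pow(e, -1, phi)}


def E(key, x: int) -> int:
    """Public encoding map of the key's owner."""
    return pow(x, key["e"], key["n"])


def D(key, x: int) -> int:
    """Private deciphering map of the key's owner."""
    return pow(x, key["d"], key["n"])


def send_signed(message: int, sender, receiver) -> int:
    signature = D(sender, message)   # S = D_i(M)
    return E(receiver, signature)    # transmitted value E_j(S)


def receive_signed(cipher: int, sender, receiver):
    signature = D(receiver, cipher)  # S = D_j(E_j(S))
    message = E(sender, signature)   # M = E_i(S), using only public data of U_i
    return message, signature


if __name__ == "__main__":
    u_i = keypair(61, 53, 17)   # n_i = 3233
    u_j = keypair(89, 97, 5)    # n_j = 8633 > n_i
    cipher = send_signed(1234, u_i, u_j)
    print(receive_signed(cipher, u_i, u_j))
```

The receiver keeps S as evidence: anyone can check that E_i(S) equals the message, but producing such an S requires D_i.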

We shall concentrate now on the number-theoretical rather than the in- 
formation—theoretical aspects of public key cryptosystems. We shall describe 
how some classical number-theoretical results can be applied to two particular 
problems in this domain. 

Problem 1. How does one produce large primes? 

We want to stress that we really need an efficient method for mass produc- 
tion of “sufficiently random” large primes, in order to allow a user to compute 
(with the assistance of a large computer) his customized pair (p;,q;), and to 
be sure that a different user will get a different pair. 

Problem 2. How does one factorize large integers? 

This problem is crucial for a third party wanting to break the cryptosystem 
and, of course, for the designers wanting to secure its infallibility (cf. [DH76], 
[ARS78], [Kah71]). 


According to A.Wiles [Wi2000], one change in number theory over the last
twenty years is that it has become an applied subject (perhaps one should
say it has gone back to being an applied subject as it was more than two
thousand years ago). Public key cryptography has changed the way we look
at secrecy and codes. The RSA system depends on the practical difficulty
of factoring a number. The seventeenth century problems of generating large
primes, primality testing and factoring now pose new and precise problems.
How fast can algorithms for answering these questions be? The question of
primality testing is solved (cf. §2.2.4, §2.2.6). The other two are not
theoretically solved, although the first of the problems seems much easier in practice
than the third.


2.1.4 Statistics and Mass Production of Primes 


The asymptotic law of the distribution of primes (or prime number theorem)
is $\pi(x) \sim \dfrac{x}{\log x}$ (cf. 1.1.6). We can start then with a naive
assumption that if N is not too small with respect to x then between x and
x + N there should be about N/log x primes. For example, if the least prime
following x is bounded by $x + (\log x)^{M}$, then one can just check
successively x, x + 1, x + 2, .... The complexity of the prime production
would then be of the same order as the complexity of the primality testing
algorithm used. If one can take M = 1, then to produce a prime of order about
$10^{100}$ one should first produce a random number x of that order, and then
test about $(\log 10^{100})/2 \approx 115$ odd integers. If there is a
primality test for y which is polynomial in log y, then this is a feasible
task.
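The heuristic count above can be checked numerically. The following Python sketch is only an illustration: the helper name `primes_in_interval` and the sample interval starting at $10^{6}$ are choices made here, not part of the text. It compares the number of primes in a short interval with N/log x and evaluates the estimate $(\log 10^{100})/2$.

```python
import math

def primes_in_interval(x, width):
    """Count primes p with x <= p < x + width (trial division, small x only)."""
    def is_prime(m):
        if m < 2:
            return False
        if m % 2 == 0:
            return m == 2
        d = 3
        while d * d <= m:
            if m % d == 0:
                return False
            d += 2
        return True
    return sum(1 for m in range(x, x + width) if is_prime(m))

# Compare the observed count with the naive prediction N / log x.
x, width = 10**6, 10**4
observed = primes_in_interval(x, width)
expected = width / math.log(x)
print("observed:", observed, "predicted:", round(expected, 1))

# Number of odd candidates one expects to test near 10^100 when M = 1.
candidates = math.log(10**100) / 2
print("odd candidates near 10^100:", round(candidates))
```

The trial division used here is only suitable for small x; for numbers of cryptographic size one would plug in one of the primality tests discussed in the following subsections.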

We shall discuss in the following subsection efficient probabilistic primality 
tests. 

We should remark however that such absence of large gaps between primes 
is not proved and probably is not even true. All known results on the gaps 
give upper bounds which are powers of x (see [HB88], [Hild88], [Zag77]). We 
quote some of them: 


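$$\pi(x) - \pi(x - x^{7/12}) = \frac{x^{7/12}}{\log x}\left(1 + O\left(\left(\frac{\log\log x}{\log x}\right)^{4}\right)\right), \qquad (2.1.2)$$

$$\pi(x + x^{\theta}) - \pi(x) > C(\theta)\,\frac{x^{\theta}}{\log x}$$

for $\theta > 11/20$, where $C(\theta)$ is a positive function. For almost all x,
a stronger result is known:

$$\pi(x + x^{\theta}) - \pi(x) > 0.15\,\frac{x^{\theta}}{\log x}$$

if $\theta > 1/12$.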

For an interesting discussion of large gaps between primes, see [Ries85],
p. 84, and [Hild88], [Zag77].


2.1.5 Probabilistic Primality Tests 


Some modern efficient primality tests actually check a weaker property
connected with the notion of Eulerian pseudoprimes (cf. §1.1.5). We recall that
n is called an Eulerian pseudoprime modulo b if n and b are relatively prime,
and

$$b^{(n-1)/2} \equiv \left(\frac{b}{n}\right) \bmod n. \qquad (2.1.3)$$

Primes are pseudoprimes modulo every b: this follows from the fact that
$(\mathbb{Z}/n\mathbb{Z})^{\times}$ is cyclic (cf. section 1.1.5). One readily
sees that for composite n, (2.1.3) fails for at least half of the residue
classes in $(\mathbb{Z}/n\mathbb{Z})^{\times}$. A probabilistic primality test
based upon this observation consists of checking (2.1.3) for, say, several
hundred randomly chosen b. If an n passes such a test it is sometimes called
“a commercial prime”. Commercial primes are used in public key cryptosystems
although, strictly speaking, their primality is not established by the test.
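The test based on (2.1.3) can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names (`jacobi`, `is_euler_pseudoprime`, `solovay_strassen`, `euler_liars`) and the default of 50 rounds are assumptions made here. The Jacobi symbol is computed by the usual reciprocity algorithm, so no factorization of n is needed.

```python
import random
from math import gcd

def jacobi(b, n):
    """Jacobi symbol (b/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    b %= n
    sign = 1
    while b != 0:
        while b % 2 == 0:              # remove factors of 2 using (2/n)
            b //= 2
            if n % 8 in (3, 5):
                sign = -sign
        b, n = n, b                    # quadratic reciprocity for odd b, n
        if b % 4 == 3 and n % 4 == 3:
            sign = -sign
        b %= n
    return sign if n == 1 else 0

def is_euler_pseudoprime(n, b):
    """Check congruence (2.1.3) for odd n > 2 and base b."""
    j = jacobi(b, n)
    return j != 0 and pow(b, (n - 1) // 2, n) == j % n

def solovay_strassen(n, rounds=50, rng=random):
    """Probabilistic test: False means composite, True means 'commercial prime'."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    return all(is_euler_pseudoprime(n, rng.randrange(2, n - 1))
               for _ in range(rounds))

def euler_liars(n):
    """Units b modulo n for which (2.1.3) holds (exhaustive, small n only)."""
    return [b for b in range(1, n) if gcd(b, n) == 1 and is_euler_pseudoprime(n, b)]

print(solovay_strassen(561), len(euler_liars(561)))
```

For the Carmichael number 561 the base 2 satisfies (2.1.3), but, in agreement with the statement above, at most half of the units do, so a moderate number of random bases detects compositeness with overwhelming probability.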

It turns out that such a proof could be given if one assumes the generalized
Riemann conjecture on the zeroes of the Dirichlet L-functions. Namely, one
can deduce from this conjecture that the validity of (2.1.3) for all
$b < 2(\log n)^{2}$ implies that n is prime (cf. [Mil76], [Wag86]). To check
this property, it suffices to perform $O((\log n)^{4+\varepsilon})$ divisions
with remainder, for any $\varepsilon > 0$.

This Solovay-Strassen primality test admits some interesting variations,
e.g., the Miller-Rabin test (see [SolSt77], [Mil76], [Ra80], [Ries85],
[Schr84], [Kob94]). It is based on the following notion of strict
pseudoprimality. Suppose that n is pseudoprime modulo b, so that
$b^{n-1} \equiv 1 \bmod n$. We shall now calculate all consecutive square
roots of the left hand side, that is, $b^{(n-1)/2^{i}}$ for i = 1, ..., s,
where $t = (n-1)/2^{s}$ is odd. If n is prime then the first residue class in
this sequence distinct from 1 should be −1. We shall call n a strict
pseudoprime if either $b^{t} \equiv 1 \bmod n$ or for some $0 \le r < s$ we have

$$b^{2^{r}t} \equiv -1 \bmod n. \qquad (2.1.4)$$

The Miller-Rabin test consists of checking this property for a set of randomly
chosen b.
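A minimal Python sketch of the strict pseudoprimality check (2.1.4), combined with the mass production idea of §2.1.4 (test x, x + 2, x + 4, ... starting from a random odd x). The function names and the choice of 40 rounds are illustrative assumptions, and the output of `random_probable_prime` is a commercial prime in the sense above, not a proven prime.

```python
import random

def is_strict_pseudoprime(n, b):
    """Check b^t = 1 or b^(2^r t) = -1 (mod n) for some 0 <= r < s, n odd > 2."""
    s, t = 0, n - 1
    while t % 2 == 0:                  # write n - 1 = 2^s * t with t odd
        s += 1
        t //= 2
    y = pow(b, t, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(s - 1):             # r = 1, ..., s - 1
        y = y * y % n
        if y == n - 1:
            return True
    return False

def miller_rabin(n, rounds=40, rng=random):
    """Probabilistic test with randomly chosen bases b."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    return all(is_strict_pseudoprime(n, rng.randrange(2, n - 1))
               for _ in range(rounds))

def random_probable_prime(bits, rng=random):
    """Start from a random odd x with the given bit length and test x, x+2, ..."""
    x = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
    tested = 1
    while not miller_rabin(x, rng=rng):
        x += 2
        tested += 1
    return x, tested

p, tested = random_probable_prime(256)
print(p, "found after testing", tested, "odd candidates")
```

For 256-bit numbers the heuristic of §2.1.4 suggests on the order of $(\log 2^{256})/2 \approx 89$ odd candidates on average, which is what one typically observes when running the sketch repeatedly.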


It was noticed by F. Morain [Mor03a] that in practice the algorithm of
Miller is quite slow, because it needs to compute numerous modular exponents;
a faster method, the ECPP, is discussed in Section 2.2.6.


In Section 2.2.4 we describe an important recent theoretical discovery that
primes are recognizable in polynomial time: the work of M. Agrawal, N. Kayal
and N. Saxena [AKS], who found that a polynomial version of Fermat’s Little
Theorem (1.1.2) leads to a fast deterministic algorithm for primality testing.
The running time of this algorithm is $\tilde{O}(\log^{12} N)$, where the
notation $\tilde{O}(t(N))$ stands for $O(t(N)\cdot \mathrm{poly}(\log t(N)))$
for a function t(N) of N.


2.1.6 The Discrete Logarithm Problem and The Diffie-Hellman 
Key Exchange Protocol 


The Diffie-Hellman key exchange is the first public-key cryptosystem ever
published [DH76].
In order to communicate important information to Bob, Alice may use this
algorithm as follows. Alice and Bob agree on a prime number p and an integer
g that has order p − 1 modulo p. (So $g^{p-1} \equiv 1 \pmod p$, but
$g^{n} \not\equiv 1 \pmod p$ for any positive n < p − 1.) Alice chooses a
random number n < p, and Bob chooses a random number m < p. Alice sends
$g^{n} \bmod p$ to Bob, and Bob sends $g^{m} \bmod p$ to Alice. Alice can now
compute the secret key:

$$s = g^{mn} = (g^{m})^{n} \pmod p.$$

Likewise, Bob computes the secret key:

$$s = g^{mn} = (g^{n})^{m} \pmod p.$$

Now Alice uses the secret key s to send Bob an encrypted version of her
message. Bob, who also knows s, is able to decode the message.

Non-authorized persons can see both $g^{n} \pmod p$ and $g^{m} \pmod p$, but
they are not able to use this information to deduce either m, n, or
$g^{mn} \pmod p$ quickly enough.
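The exchange can be traced on a toy example. The Python sketch below uses the illustrative values p = 23, g = 5, n = 6, m = 15 (chosen here, not taken from the text; real applications use primes of thousands of bits), and the helper names are hypothetical.

```python
def has_order_p_minus_1(g, p, prime_factors_of_p_minus_1):
    """g has order p - 1 iff g^((p-1)/q) != 1 (mod p) for every prime q | p - 1."""
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors_of_p_minus_1)

def diffie_hellman(p, g, n, m):
    """Return (message to Bob, message to Alice, Alice's key, Bob's key)."""
    to_bob = pow(g, n, p)              # Alice sends g^n mod p
    to_alice = pow(g, m, p)            # Bob sends g^m mod p
    alice_key = pow(to_alice, n, p)    # (g^m)^n mod p
    bob_key = pow(to_bob, m, p)        # (g^n)^m mod p
    return to_bob, to_alice, alice_key, bob_key

p, g = 23, 5                           # 22 = 2 * 11
print(has_order_p_minus_1(g, p, [2, 11]))
print(diffie_hellman(p, g, n=6, m=15))
```

Both parties arrive at the same key, while an eavesdropper sees only the two transmitted residues; recovering n or m from them is an instance of the discrete logarithm problem.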


2.1.7 Computing of the Discrete Logarithm on Elliptic Curves 
over Finite Fields (ECDLP) 


The Abelian group structure on the group of points $E(\mathbb{F}_q)$ of an
elliptic curve E is used in many arithmetical questions. In particular, the
case when this group is cyclic of large size leads to the ECDLP (“elliptic
curve discrete logarithm problem”), which is extremely important for
applications in public-key cryptography, see [Kob87], [Kob98], [Kob01],
[Fr01], [Men93], [Kob02]. This idea was independently proposed by Neal Koblitz
and Victor Miller in 1985, and since then there has been an enormous amount of
research on the topic. The computational problem on which the security depends
is the elliptic curve discrete logarithm problem: given an elliptic curve E
over a finite field $\mathbb{F}_q$ and two points $P, Q \in E(\mathbb{F}_q)$,
find an integer $\lambda$ (if it exists) such that $Q = [\lambda]P$. If the
field size q is sufficiently large, and if the elliptic curve E avoids various
special cases, then this seems to be a difficult computational problem.
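To make the problem concrete, here is a hedged Python sketch on a toy curve. The curve $y^{2} = x^{3} + 2x + 2$ over $\mathbb{F}_{17}$ and the base point P = (5, 1) are illustrative choices made here (a standard classroom example), and all function names are hypothetical. The point at infinity is represented by `None`, and the discrete logarithm is found by exhaustive search, which is feasible only because the group is tiny.

```python
def ec_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + a x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p    # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, a, p):
    """Compute [k]P by double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, P, a, p)
        P = ec_add(P, P, a, p)
        k >>= 1
    return result

def ec_dlog(P, Q, a, p):
    """Smallest k >= 1 with [k]P = Q, or None; exhaustive search (toy sizes only)."""
    R, k = P, 1
    while R is not None:
        if R == Q:
            return k
        R = ec_add(R, P, a, p)
        k += 1
    return None

A, PRIME = 2, 17
P = (5, 1)
Q = ec_mul(7, P, A, PRIME)
print(Q, ec_dlog(P, Q, A, PRIME))
```

The exhaustive search takes time proportional to the group order; for curves of cryptographic size no method is known that is dramatically better than roughly the square root of the group order, which is the source of the presumed difficulty mentioned above.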

Numerous applications of arithmetical algebraic geometry to cryptographic 
constructions were discussed in [Fr01], [Men93], and in other good sources: 
[Kob2000], where the problem of computing the orders of elliptic curve groups 
is discussed in some detail as well. For example, we can learn about the cryp- 
tographic significance of old number-theoretic questions such as the existence 
of infinitely many Sophie Germain primes and Mersenne primes. 


2.2 Deterministic Primality Tests 


Probabilistic polynomial-time primality tests have been known for many years.
There is a well-known almost-polynomial-time
($(\log n)^{O(\log\log\log n)}$) deterministic algorithm due to Adleman,
Pomerance and Rumely (1983), cf. [APR83], [LeH.80], [CoLe84], and there are
also randomized algorithms due to Goldwasser-Kilian, cf. [GK86], [GK99],
Atkin-Morain [AtMo93b], and Adleman-Huang [AdHu92], which give certificates
for both primality and compositeness in expected polynomial time on all
inputs. This method of primality proving using elliptic curves, the ECPP, was
further developed by F. Morain [Mor98a].

In August 2002, a deterministic polynomial-time algorithm was found by
M. Agrawal, N. Kayal and N. Saxena from the IIT Kanpur. Among other
things, we give an exposition of this result in this section.

We describe some deterministic primality tests:


a) The tests of Adleman, Pomerance and Rumely (1983): they have
subexponential running time, and the proofs that they work are unconditional
(i.e. they do not use any unproved conjectures).

b) A recent discovery that primes are recognizable in polynomial time, by M.
Agrawal, N. Kayal and N. Saxena, who found that a polynomial version
of Fermat’s Little Theorem (1.1.2) leads to a fast deterministic algorithm
for primality testing. The running time of this algorithm is
$\tilde{O}(\log^{12} n)$, where the notation $\tilde{O}(t(n))$ stands for
$O(t(n)\cdot \mathrm{poly}(\log t(n)))$ for a function t(n) of n.

c) Elliptic curves and primality proving: the ECPP (Elliptic Curve Primality
Proving) by F. Morain, see [AtMo93b], [Mor98a], [Mor03].


2.2.1 Adleman—Pomerance—Rumely Primality Test: Basic Ideas 


There are two main variants of this algorithm, cf. [APR83], [LeH.80],
[CoLe84]: a simpler, probabilistic version, and a deterministic one. Its
running time is bounded by

$$(\log n)^{c \log\log\log n},$$

where c is an effective constant. The exponent in this expression grows so
slowly that this bound can be considered “almost polynomial”. All previously
known deterministic primality tests had exponential running time (e.g.,
Pollard’s test, described in [Pol74], [Ries85], requires about

$$n^{\frac{1}{8}+\varepsilon} = \exp\left(\left(\tfrac{1}{8}+\varepsilon\right)\log n\right)$$

operations).


The algorithm consists of the following steps.

a) One checks a series of conditions generalizing the congruence (2.1.3) for
the Jacobi symbol. If n fails to satisfy any of these conditions, then it is
composite.

b) If n passes the first stage, the test furnishes a small set of integers
containing all divisors r of n not exceeding $\sqrt{n}$. It remains to check
whether n is divisible by at least one element of this set.

c) The set of potential divisors r is determined by specifying their residue
classes modulo an integer $s > \sqrt{n}$, which in turn is a product of several
distinct primes q. In view of the Chinese Remainder Theorem (cf. (1.1.5)),
it suffices to specify r mod q for all q dividing s.

d) Every q dividing s satisfies the following condition: q − 1 is a product of
several distinct primes taken from a fixed set $\{p_0, \ldots, p_k\}$. These
primes are called the initial primes, and the q are called the Euclidean
primes, because they are constructed by the method used in Euclid’s proof that
the set of primes is infinite:


$$q = 1 + p_0^{a_0} p_1^{a_1} \cdots p_k^{a_k}, \qquad a_i = 0 \text{ or } 1.$$


To estimate the running time, one has to use a hard theorem from analytic
number theory (cf. [Pra57]) which guarantees that even for a small set of
initial primes, the product of all Euclidean primes generated by them can
be large. More precisely, given n, one can determine a set of initial primes
$\{p_0, \ldots, p_k\}$ whose product t is bounded by

$$t = \prod_{i=0}^{k} p_i < (\log n)^{c_2 \log\log\log n} \quad (n > e^{e}), \qquad (2.2.1)$$

whereas the product of the corresponding Euclidean primes is bounded
from below by

$$s = \prod_{(q-1)\mid t} q > \sqrt{n}, \qquad (2.2.2)$$

where $c_2$ is a computable positive constant. Notice that in this situation
the number of Euclidean primes is bounded by $\pi(t+1) < t+1$. For any
$n < 10^{350}$ one can take t = 2·3·5·7·11·13·17·19.


e) To determine r mod q, one actually calculates the discrete logarithms
ind(r, g, q) of all possible r with respect to a fixed generator g of
$(\mathbb{Z}/q\mathbb{Z})^{\times}$. These logarithms are in turn determined by
their residue classes ind(r, g, q) mod $p_i$, where $p_i$ runs over all initial
primes. Again, this follows from the Chinese Remainder Theorem.
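The construction of the Euclidean primes in step d) is easy to carry out explicitly. The following Python sketch (function names are illustrative choices made here) enumerates all q = 1 + (product of a subset of the initial primes) that are prime, and computes their product s; it then prints the size of s for the set of initial primes 2, 3, 5, 7, 11, 13, 17, 19 quoted above, so that the inequality (2.2.2) can be inspected for a given bound on n.

```python
from itertools import combinations
from math import prod

def is_prime(m):
    """Trial division; adequate here because q <= t + 1 is small."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def euclidean_primes(initial_primes):
    """All primes q such that q - 1 is a product of distinct initial primes."""
    found = []
    for size in range(len(initial_primes) + 1):
        for subset in combinations(initial_primes, size):
            q = 1 + prod(subset)       # the empty product gives q = 2
            if is_prime(q):
                found.append(q)
    return sorted(found)

small = euclidean_primes([2, 3, 5])
print(small, prod(small))

qs = euclidean_primes([2, 3, 5, 7, 11, 13, 17, 19])
s = prod(qs)
print(len(qs), "Euclidean primes; s has", len(str(s)), "decimal digits")
print("s^2 > 10^350:", s * s > 10**350)
```

For the initial primes 2, 3, 5 one finds the Euclidean primes 2, 3, 7, 11, 31, whose product is already much larger than t = 30; this rapid growth of s relative to t is what the theorem of [Pra57] quantifies.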


We shall now describe the algorithm in more detail.


2.2.2 Gauss Sums and Their Use in Primality Testing


For an odd q, the Euler criterion
$\left(\frac{q}{n}\right) \equiv q^{(n-1)/2} \bmod n$ can be rewritten in
the form

$$\left(\frac{n}{q}\right)(-1)^{\frac{n-1}{2}\cdot\frac{q-1}{2}} \equiv q^{\frac{n-1}{2}} \bmod n, \qquad (2.2.3)$$

which gives a formula for calculating the quadratic residue symbol of n modulo
q. In the algorithm we discuss here, one uses generalizations of this formula
to arbitrary p-th power residue symbols for initial primes p. In order to
explain these generalizations, we must introduce Gauss sums, which were
initially used in one of Gauss’ proofs of the quadratic reciprocity law (cf.
below).

One calculates the number of solutions of a congruence $x^{p} \equiv a$ in
$(\mathbb{Z}/q\mathbb{Z})^{\times}$ with the help of the Dirichlet characters
of order p modulo q, that is, the homomorphisms
$\chi : (\mathbb{Z}/q\mathbb{Z})^{\times} \to \mathbb{C}^{\times}$. Every such
character is defined by the image $\exp(2\pi i k/p)$ of a generator g of
$(\mathbb{Z}/q\mathbb{Z})^{\times}$. The number of such characters is p if p
divides q − 1, and 1 otherwise. If q is prime, we have

$$\mathrm{Card}\{x \in (\mathbb{Z}/q\mathbb{Z})^{\times} \mid x^{p} = a\} = \sum_{\chi \,:\, \chi^{p}=1} \chi(a). \qquad (2.2.4)$$

In particular, for p = 2 this is $1 + \left(\frac{a}{q}\right)$. The sum on
the right hand side of (2.2.4) vanishes iff $\chi(a) \ne 1$ for some $\chi$.
This happens only if p | (q − 1) and a is not a p-th power modulo q. If p does
not divide q − 1, both sides are equal to 1. Finally, if p | (q − 1) and a is a
p-th power, both sides are equal to p.

One way to understand Gauss sums is to view them as discrete analogues
of the gamma function $\Gamma(s)$, which for Re(s) > 0 is given by the integral

$$\Gamma(s) = \int_{0}^{\infty} e^{-y} y^{s}\,\frac{dy}{y}. \qquad (2.2.5)$$

Here the integrand is the product of an additive quasicharacter of
$\mathbb{R}$ (the homomorphism $y \mapsto e^{-y}$) and a multiplicative
quasicharacter $y \mapsto y^{s}$ of $\mathbb{R}^{\times}_{+}$. One integrates
this over the positive reals with respect to the multiplicative invariant
measure $\frac{dy}{y}$.

In order to get a Gauss sum, one should replace here $\mathbb{R}$ by
$\mathbb{Z}/N\mathbb{Z}$ for some N > 1; $e^{-y}$ by an additive character
$\mathbb{Z}/N\mathbb{Z} \to \mathbb{C}^{\times} : y \mapsto \zeta_N^{y}$,
$\zeta_N = \exp(2\pi i/N)$, and $y^{s}$ by a multiplicative character
$\chi : (\mathbb{Z}/N\mathbb{Z})^{\times} \to \mathbb{C}^{\times}$. A Dirichlet
character $\chi : \mathbb{Z} \to \mathbb{C}$ corresponding to $\chi$ and
denoted also $\chi$ is defined by $\chi(a) = \chi(a \bmod N)$ for (a, N) = 1
and by $\chi(a) = 0$ for (a, N) > 1. The Gauss sum $G(\chi)$ is, by definition,

$$G(\chi) = \sum_{x=0}^{N-1} \chi(x)\,\zeta_N^{x}. \qquad (2.2.6)$$

For $a \in \mathbb{Z}$, the following notation is often used:

$$G_a(\chi) = \sum_{x=0}^{N-1} \chi(x)\,\zeta_N^{ax}.$$

Since the formulae (2.2.5) and (2.2.6) are obviously similar, they define
functions with many similar properties.

To state them, we need the important notion of a primitive Dirichlet
character. A character $\chi$ is primitive modulo N if it is not induced by a
character modulo M for any proper divisor M of N. Equivalently, the restriction
of $\chi$ to any subgroup
$H_M = ((1 + M\mathbb{Z})/(1 + N\mathbb{Z}))^{\times}$ is non-trivial. If
$\chi$ is primitive, we have

$$G_a(\chi) = \overline{\chi}(a)\,G(\chi) \quad (a \in \mathbb{Z}), \qquad (2.2.7)$$
$$\overline{G(\chi)} = \chi(-1)\,G(\overline{\chi}), \qquad (2.2.8)$$
$$|G(\chi)|^{2} = N. \qquad (2.2.9)$$
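Properties (2.2.7)-(2.2.9) can be verified numerically for a small modulus. The Python sketch below is an illustration under assumptions made here: the modulus N = 7, the generator 3 of $(\mathbb{Z}/7\mathbb{Z})^{\times}$ and the cubic character are arbitrary sample choices, and the function names are hypothetical. Floating-point complex arithmetic is used, so identities are checked up to rounding error.

```python
import cmath

def make_character(q, g, p, k=1):
    """Character mod a prime q with chi(g^e) = exp(2 pi i k e / p), chi(0) = 0.

    Assumes g generates (Z/qZ)^x and p divides q - 1."""
    table = {}
    x = 1
    for e in range(q - 1):
        table[x] = cmath.exp(2j * cmath.pi * k * e / p)
        x = x * g % q
    return lambda a: table.get(a % q, 0)

def gauss_sum(chi, N, a=1):
    """G_a(chi) = sum over x mod N of chi(x) * zeta_N^(a x)."""
    return sum(chi(x) * cmath.exp(2j * cmath.pi * ((a * x) % N) / N)
               for x in range(N))

N = 7
chi = make_character(N, 3, 3)            # a primitive cubic character mod 7
chi_bar = make_character(N, 3, 3, k=2)   # its complex conjugate
G = gauss_sum(chi, N)

print(abs(G) ** 2)                                        # (2.2.9): close to 7
print(abs(G.conjugate() - chi(-1) * gauss_sum(chi_bar, N)))  # (2.2.8): close to 0
print(max(abs(gauss_sum(chi, N, a) - chi(a).conjugate() * G)
          for a in range(N)))                             # (2.2.7): close to 0
```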


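Property (2.2.7) corresponds to the integral formula

$$\int_{0}^{\infty} e^{-ay} y^{s}\,\frac{dy}{y} = a^{-s}\,\Gamma(s) \quad (\mathrm{Re}(s) > 0),$$

and (2.2.9), rewritten in the form $G(\chi)G(\chi^{-1}) = \chi(-1)N$,
corresponds to the functional equation

$$\Gamma(s)\Gamma(1-s) = \frac{\pi}{\sin \pi s}.$$
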
From (2.2.7)-(2.2.9) one readily deduces the quadratic reciprocity law. Let
us prove, for example, the main formula

$$\left(\frac{l}{q}\right)\left(\frac{q}{l}\right) = (-1)^{\frac{l-1}{2}\cdot\frac{q-1}{2}}, \qquad (2.2.10)$$

where l and q are odd primes. Notice first that the quadratic residue symbol
$\chi(a) = \left(\frac{a}{q}\right)$ is a primitive Dirichlet character modulo
q. The corresponding quadratic Gauss sum $G(\chi)$ is an element of the
cyclotomic ring of algebraic integers $R = \mathbb{Z}[\zeta_q]$. In any
commutative ring the congruence $(a+b)^{l} \equiv a^{l} + b^{l} \bmod lR$
holds, because the binomial coefficients $\binom{l}{i}$ $(0 < i < l)$ are
divisible by l. Since $\chi^{l}(a) = \chi(a) = \pm 1$, we have

$$G(\chi)^{l} \equiv G_l(\chi^{l}) \bmod lR, \qquad G_l(\chi^{l}) = \chi(l)\,G(\chi),$$

so that

$$G(\chi)^{l-1} \equiv \left(\frac{l}{q}\right) \bmod lR. \qquad (2.2.11)$$

On the other hand, $\overline{\chi} = \chi$, and from (2.2.9) it follows that

$$G(\chi)^{2} = \chi(-1)\,q = (-1)^{\frac{q-1}{2}}\,q. \qquad (2.2.12)$$

Representing the left hand side of (2.2.11) as
$(G(\chi)^{2})^{\frac{l-1}{2}}$, we obtain

$$(-1)^{\frac{q-1}{2}\cdot\frac{l-1}{2}}\, q^{\frac{l-1}{2}} \equiv \left(\frac{l}{q}\right) \bmod lR. \qquad (2.2.13)$$

Finally, (2.2.13) and Euler’s criterion

$$q^{\frac{l-1}{2}} \equiv \left(\frac{q}{l}\right) \bmod l$$

give (2.2.10).


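For $\mathbb{Z}/N\mathbb{Z}$, there is also an analogue of the beta-function

$$B(s,t) = \int_{0}^{1} x^{s-1}(1-x)^{t-1}\,dx = \int_{0}^{\infty} \frac{y^{s}}{(1+y)^{s+t}}\,\frac{dy}{y} \quad (\mathrm{Re}(s), \mathrm{Re}(t) > 0).$$

It is called the Jacobi sum depending on two Dirichlet characters
$\chi, \psi$ mod N. By definition,

$$J(\chi,\psi) = \sum_{x \bmod N} \chi(x)\,\psi(1-x) = \sum_{y \bmod N} \chi(y)\,\overline{\chi\psi}(1+y). \qquad (2.2.14)$$

(The equality of these two expressions can be established by the change of
variables $y(1-x) \mapsto x$, $x(1+y) \mapsto y$.) If $\chi$, $\psi$, and
$\chi\psi$ are primitive modulo N, we have

$$J(\chi,\psi) = G(\chi)G(\psi)/G(\chi\psi) = J(\psi,\chi), \qquad (2.2.15)$$

which corresponds to the classical identity
$B(s,t) = \Gamma(s)\Gamma(t)/\Gamma(s+t)$. In fact, let us calculate the
product

$$G(\chi)G(\psi) = \sum_{x \bmod N} \chi(x)\,\zeta_N^{x}\,G(\psi) = \sum_{x \bmod N} \chi\psi(x)\,\zeta_N^{x}\,\overline{\psi}(x)\,G(\psi). \qquad (2.2.16)$$

Applying (2.2.7), we get

$$\overline{\psi}(x)\,G(\psi) = G_x(\psi) = \sum_{y \bmod N} \zeta_N^{xy}\,\psi(y),$$

so that (2.2.16) becomes

$$\sum_{x,y \bmod N} \chi\psi(x)\,\psi(y)\,\zeta_N^{x(1+y)} = \sum_{y \bmod N} \psi(y)\,G_{1+y}(\chi\psi) = \sum_{y \bmod N} \psi(y)\,\overline{\chi\psi}(1+y)\,G(\chi\psi) = J(\psi,\chi)\,G(\chi\psi).$$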
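The identity (2.2.15) and the two expressions in (2.2.14) can also be checked numerically. The Python sketch below works modulo the sample prime 7 with generator 3 (choices made here for illustration); characters are indexed by an integer k through $\chi_k(3^{e}) = \exp(2\pi i k e/6)$, so that $\chi_2$ is cubic, $\chi_3$ is quadratic and their product is $\chi_5$. All names are hypothetical.

```python
import cmath

Q, GEN = 7, 3                            # prime modulus and a generator mod 7

def character(k):
    """chi_k(GEN^e) = exp(2 pi i k e / (Q - 1)), chi_k(0) = 0."""
    table = {}
    x = 1
    for e in range(Q - 1):
        table[x] = cmath.exp(2j * cmath.pi * k * e / (Q - 1))
        x = x * GEN % Q
    return lambda a: table.get(a % Q, 0)

def gauss_sum(chi):
    """G(chi) = sum over x mod Q of chi(x) * zeta_Q^x."""
    return sum(chi(x) * cmath.exp(2j * cmath.pi * x / Q) for x in range(Q))

def jacobi_sum(chi, psi):
    """J(chi, psi) = sum over x mod Q of chi(x) * psi(1 - x)."""
    return sum(chi(x) * psi(1 - x) for x in range(Q))

chi, psi, chi_psi = character(2), character(3), character(5)

J = jacobi_sum(chi, psi)
second_form = sum(chi(y) * chi_psi(1 + y).conjugate() for y in range(Q))
quotient = gauss_sum(chi) * gauss_sum(psi) / gauss_sum(chi_psi)

print(J, second_form, quotient, jacobi_sum(psi, chi))
print(abs(J) ** 2)                       # equals Q up to rounding
```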


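We now establish some congruences useful in primality testing. Let p and
q be primes, p | (q − 1), $\chi$ a Dirichlet character of degree p modulo q.
Choose a generator $t = t_q$ of $(\mathbb{Z}/q\mathbb{Z})^{\times}$ and put
$\eta_p = \chi(t_q)$. This is a primitive p-th root of unity, and
$G(\chi) \in R = \mathbb{Z}[\zeta_p, \zeta_q] = \mathbb{Z}[\zeta_{pq}]$. Now
let l be a prime distinct from p and q. From (2.2.7) one deduces that

$$G(\chi)^{l} \equiv \chi(l)^{-l}\,G(\chi^{l}) \bmod lR. \qquad (2.2.17)$$

Iterating this p − 1 times, we obtain

$$G(\chi)^{l^{p-1}} \equiv \chi(l)^{-(p-1)l^{p-1}}\,G(\chi^{l^{p-1}}) \bmod lR,$$

so that

$$G(\chi)^{l^{p-1}-1} \equiv \chi(l) \bmod lR, \qquad (2.2.18)$$

because $l^{p-1} \equiv 1 \bmod p$. Now (2.2.18) can be rewritten in the form

$$(G(\chi)^{p})^{(l^{p-1}-1)/p} \equiv \chi(l) \bmod lR,$$

which generalizes the formula (2.2.13).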

It is important that $G(\chi)^{p}$ belongs to the smaller ring
$\mathbb{Z}[\zeta_p]$ (for p = 2, this is just $\mathbb{Z}$). Moreover, it can
be expressed via Jacobi sums: for p > 2 we have

$$G(\chi)^{p} = \chi(-1)\,q \prod_{i=1}^{p-2} J(\chi,\chi^{i}). \qquad (2.2.19)$$

To prove this identity, it suffices to multiply termwise the formulae

$$J(\chi,\chi^{i}) = \frac{G(\chi)G(\chi^{i})}{G(\chi^{i+1})} \quad (1 \le i \le p-2),$$

taking into account (2.2.8) in the form
$G(\chi^{p-1})G(\chi) = G(\overline{\chi})G(\chi) = \chi(-1)q$.

One uses (2.2.19) in conjunction with a congruence due to Iwasawa ([Iwa75],
Theorem 1):

$$J(\chi^{a},\chi^{b}) \equiv -1 \bmod (\lambda)^{2},$$

where $(\lambda) = (1-\zeta_p)$ is a prime ideal of $\mathbb{Z}[\zeta_p]$.
Therefore,

$$G(\chi)^{p} \equiv -\chi(-1)\,q \bmod (\lambda)^{2}, \qquad (2.2.20)$$

which becomes an exact equality for p = 2.


2.2.3 Detailed Description of the Primality Test


a) In the preliminary stage (cf. section 2.2.1 d), one calculates the number
$t = \prod_{i=0}^{k} p_i$ which is the product of the initial primes
satisfying (2.2.1) and (2.2.2):

$$t < (\log n)^{c_2 \log\log\log n}, \qquad s = \prod_{q,\ (q-1)\mid t} q > \sqrt{n}.$$

As we have already mentioned, for $n < 10^{350}$ we can take
t = 2·3·5·7·11·13·17·19. In general, to find t one uses a trial-and-error
method, and the primality of the Euclidean primes is tested by a primitive
case-by-case check. Since each q is bounded by t + 1, and the number of q’s
is bounded by $\pi(t+1) < t+1$, this preliminary stage requires no more
than $(\log n)^{c_3 \log\log\log n}$ operations, with an effective positive
constant $c_3$. At this stage, one should also check that (n, s) = (n, t) = 1
(otherwise n is composite, and the algorithm stops).

b) The necessary conditions of primality, essentially of the type (2.2.18),
are then checked for every pair p, q with p | (q − 1), q | s, and every
Dirichlet character $\chi$ mod q of degree p. It is convenient to fix p and
vary q. For each q, one calculates a generator $t_q$ of
$(\mathbb{Z}/q\mathbb{Z})^{\times}$. The Dirichlet characters correspond to
primitive roots of unity $\eta_p$. The primality condition corresponding to
$(p, q, \chi)$ is

$$G(\chi)^{n^{p-1}-1} \equiv \eta(\chi) \bmod nR, \qquad (2.2.21)$$

where $\eta(\chi)$ is a p-th root of unity (for prime n,
$\eta(\chi) = \chi(n)$, in view of (2.2.18)). To check (2.2.21), one expands
the left hand side with respect to the $\mathbb{Z}$-basis of
$R = \mathbb{Z}[\zeta_p, \zeta_q]$ and compares it with the right hand side
coordinate-wise.

c) If all the congruences (2.2.21) hold true, one calculates a set containing
virtual prime divisors r of n not exceeding $\sqrt{n}$. We shall first explain
how this is done in the simplest case when $n^{p-1} - 1$ is not divisible by
$p^{2}$ for any p. Then we have simply

$$r \equiv n^{i} \pmod s \quad \text{for some } i \in \{0, 1, \ldots, t\}.$$

In fact, if r | n, put

$$l_p(r) = (r^{p-1}-1)/(n^{p-1}-1) \bmod p, \qquad l_p(r) \in \mathbb{Z}/p\mathbb{Z}. \qquad (2.2.22)$$

Then

$$l_p(rr') = l_p(r) + l_p(r'). \qquad (2.2.23)$$

If r is prime, it follows from (2.2.18) that

$$G(\chi)^{r^{p-1}-1} \equiv \chi(r) \bmod rR.$$

Let us write $(r^{p-1}-1)/(n^{p-1}-1)$ in the form a/b where
$b \equiv 1 \pmod p$, so that $l_p(r) \equiv a \pmod p$. From (2.2.21) and
(2.2.22) it follows that

$$\chi(r) = \chi(r)^{b} \equiv G(\chi)^{(r^{p-1}-1)b} = G(\chi)^{(n^{p-1}-1)a} \equiv \eta(\chi)^{a} \bmod rR,$$

and finally

$$\chi(r) = \eta(\chi)^{l_p(r)} \quad (r > 2). \qquad (2.2.24)$$

The additivity property (2.2.23) then shows that (2.2.24) holds for all
divisors of n, not only the prime ones. In particular, for r = n, we find
$\eta(\chi) = \chi(n)$, because $l_p(n) = 1$.

Summarizing, we established that if $n^{p-1} - 1$ is not divisible by $p^{2}$,
then for any triple $(p, q, \chi)$ we have

$$\chi(r) = \chi(n)^{l_p(r)},$$

so that $r \equiv n^{i} \bmod q$ where $i \equiv l_p(r) \bmod p$ for all p.

d) In general, we have $n^{p-1} - 1 = p^{h}u$, h ≥ 1, p does not divide u. The
calculations become longer, but the running time is still bounded by
$(\log n)^{c \log\log\log n}$ for a possibly larger constant c. Again, for
every triple $(p, q, \chi)$ we have the congruence (2.2.21):

$$G(\chi)^{p^{h}u} \equiv \eta(\chi) \bmod nR, \qquad h = h(p,q,\chi) \ge 1.$$

Let us define $w(\chi)$ as the smallest $i \in \{1, 2, \ldots, h\}$ such that
$G(\chi)^{p^{i}u}$ is congruent to a power of $\zeta_p$ modulo nR. If
$w(\chi) \ge 2$, the number
$G(\chi)^{p^{w(\chi)-1}u} = (G(\chi)^{p})^{p^{w(\chi)-2}u}$ belongs to the ring
$\mathbb{Z}[\zeta_p]$ with the $\mathbb{Z}$-basis
$\{1, \zeta_p, \ldots, \zeta_p^{p-2}\}$. At this stage, one must check the
following auxiliary condition:

for every $j \in \{0, 1, \ldots, p-1\}$, at least one of the coefficients of

$$G(\chi)^{p^{w(\chi)-1}u} - \zeta_p^{j} \qquad (2.2.25)$$

with respect to this basis is relatively prime to n.

If this assertion is wrong, n is composite, because it has a non-trivial
common divisor with one of the coefficients. Otherwise, one can prove,
as above, that $r^{p-1} \equiv 1 \pmod{p^{w(\chi)}}$ for all r | n, and that
for all triples $(p, q, \chi)$ with a given q one has

$$\chi(r) = \chi(\nu^{i}) \quad \text{for a certain } i \in \{0, 1, \ldots, t\}, \qquad (2.2.26)$$

where $\nu$ mod q is the uniquely defined residue class for which

$$\chi(\nu) = \eta'(\chi), \qquad \eta'(\chi) \equiv G(\chi)^{p^{w(\chi)}u} \bmod nR. \qquad (2.2.27)$$

One can also determine the root of unity $\chi(\nu) \in \mathbb{Z}[\zeta_p]$
using Jacobi sums (cf. [LeH.80]). Choose $a, b \in \mathbb{Z}$ such that

p does not divide ab(a + b), $p^{2}$ does not divide $((a+b)^{p} - a^{p} - b^{p})$

(e.g., a = b = 1 for $p < 3\cdot 10^{9}$, p ≠ 1093, 3511). Using (2.2.19), one
can prove then that
v(x) = J(x*,x°) mod nZ[Gp]- 

e) We must now synthesize all the calculations to obtain a residue class
$\nu$ modulo s such that every potential divisor r | n, $r \le \sqrt{n}$,
satisfies a congruence $r \equiv \nu^{i} \bmod s$ for some $0 \le i \le t$. In
view of the Chinese Remainder Theorem, it suffices to determine for every
q | s a power k such that $\nu \equiv t_q^{k} \bmod q$. To this end, we choose
for every p | (q − 1) a character $\chi$ with $\chi(t_q) = \zeta_p$. From
(2.2.27) it follows that $\chi(t_q^{k}) = \zeta_p^{k} = \eta'(\chi)$, which
defines k mod p and finally $\nu$ mod s.

f) It remains to check whether one of the numbers $r_i$ defined by

$$r_i \equiv \nu^{i} \bmod s, \qquad 0 \le r_i < s, \qquad 0 \le i \le t,$$

actually divides n.


A number n which passes all these checks is prime. In practice, this algo- 
rithm is quite fast (cf. [CoLe84], [Vas88]). 


Primality testing can often be speeded up by the following elementary
observation. If s is a square-free divisor of n − 1, and if for every prime
$q_i \mid s$ there exists an $a_i \in (\mathbb{Z}/n\mathbb{Z})^{\times}$ such
that

$$\gcd\left(a_i^{(n-1)/q_i} - 1,\ n\right) = 1, \qquad a_i^{n-1} \equiv 1 \bmod n, \qquad (2.2.28)$$

then each prime divisor p of n is congruent to 1 modulo s. In fact, from
(2.2.28) it follows that the order of $a_i^{(n-1)/q_i}$ in
$(\mathbb{Z}/p\mathbb{Z})^{\times}$ is equal to $q_i$. Since $q_i \mid (p-1)$,
we have s | (p − 1). In particular, if $s > \sqrt{n}$, then n is prime. Of
course, to apply this observation, one must know a sufficiently large divisor
s of n − 1.
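The observation can be turned into a small primality proof for numbers n such that n − 1 has a known large square-free divisor. The Python sketch below is an illustration under assumptions made here: the function names and the search bound `max_base` are hypothetical, and the caller is trusted to supply distinct primes $q_i$ dividing n − 1. A return value of `True` proves primality by the argument above; `False` only means that no certificate was found.

```python
from math import gcd

def satisfies_condition(n, q, a):
    """Check the two conditions of (2.2.28) for the pair (q, a)."""
    return pow(a, n - 1, n) == 1 and gcd(pow(a, (n - 1) // q, n) - 1, n) == 1

def prove_prime(n, prime_divisors, max_base=100):
    """Try to certify n using distinct primes q_i dividing n - 1.

    Returns True if a witness a_i is found for every q_i and
    s = product of the q_i exceeds sqrt(n); otherwise False (inconclusive)."""
    s = 1
    for q in prime_divisors:
        if (n - 1) % q != 0:
            raise ValueError("each q must divide n - 1")
        if not any(satisfies_condition(n, q, a) for a in range(2, max_base)):
            return False
        s *= q
    return s * s > n

print(prove_prime(2311, [2, 3, 5, 7, 11]))        # 2311 - 1 = 2*3*5*7*11
print(prove_prime(30031, [2, 3, 5, 7, 11, 13]))   # 30031 = 59 * 509
```

For the composite number 30031 no complete set of witnesses can exist, since otherwise every prime divisor would be congruent to 1 modulo 30030.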

A variant of this idea is used in some related primality tests in [LeH.80],
[GK86], and in the ECPP, see section 2.2.6. This trick was also used in a proof
that $R_{1031}$ is prime [WD86], where $R_n = (10^{n} - 1)/9$. It is known
that for lesser values of n, only $R_2$, $R_{19}$, $R_{23}$, and $R_{317}$ are
prime. A very nontrivial prime decomposition of $R_{71}$ was given in the
equality (2.1.1).

Since the work of Goldwasser and Kilian, a general primality test was
developed by Atkin-Morain which probably runs in polynomial time (see
[AtMo93b], and the end of this section for a discussion of the ECPP (Elliptic
Curve Primality Proving)). Adleman and Huang in [AdHu92] modified the
Goldwasser-Kilian algorithm to obtain a randomized polynomial-time algorithm
that always produces a certificate for primality.


2.2.4 Primes is in P 


manindra@cse.iitk.ac.in, kayaln@iitk.ac.in, 
nitinsa@cse.iitk.ac.in 

Let us describe now a recent discovery that primes are recognizable in
polynomial time. This is the work of M. Agrawal, N. Kayal and N. Saxena,
who found that a polynomial version of Fermat’s Little Theorem (1.1.2) leads
to a fast deterministic algorithm for primality testing: the running time of
this algorithm is $\tilde{O}(\log^{12} n)$, where the notation
$\tilde{O}(t(n))$ stands for $O(t(n)\cdot \mathrm{poly}(\log t(n)))$ for a
function t(n) of n.

The algorithm is based on the following polynomial version of Fermat’s
Little Theorem (1.1.2):


Theorem 2.1. Let p be an integer, and a an integer such that gcd(a, p) = 1.
Then p is a prime iff $(x - a)^{p} \equiv x^{p} - a \pmod{p\mathbb{Z}[x]}$.
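Theorem 2.1 can be checked directly for small p by comparing coefficients: the coefficient of $x^{k}$ in $(x-a)^{p}$ is $\binom{p}{k}(-a)^{p-k}$. The Python sketch below does exactly this (the function names are illustrative choices made here). Note that this direct check needs p + 1 coefficients, that is, time exponential in log p, which is why the test (2.2.29) below works modulo $x^{r} - 1$ for a small r.

```python
from math import comb

def poly_fermat_holds(p, a):
    """Check (x - a)^p = x^p - a in (Z/pZ)[x], coefficient by coefficient."""
    for k in range(p + 1):
        coeff = comb(p, k) * pow(-a, p - k, p) % p   # coefficient of x^k
        if k == p:
            target = 1
        elif k == 0:
            target = (-a) % p
        else:
            target = 0
        if coeff != target:
            return False
    return True

def is_prime(m):
    """Trial division, used only to compare with the polynomial criterion."""
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

print([p for p in range(2, 40) if poly_fermat_holds(p, 1)])
```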


Let n be the given number whose primality or compositeness is to be
determined. If n is prime then obviously the test

$$\mathrm{Test}(a,r): \quad (x - a)^{n} \overset{?}{\equiv} x^{n} - a \pmod{(x^{r} - 1,\ n)} \qquad (2.2.29)$$

will succeed (give the answer “true”) for all integers a and r. The result of
Agrawal-Kayal-Saxena says that conversely, if Test(a, r) is true for all
integers a and r in the range $0 < r \ll \log^{6} n$, $0 < a \ll \log^{4} n$,
and n has no prime factors $\ll \log^{4} n$, then n is prime or a power of a
prime. (Here and from now on all constants implied in the sign $\ll$ are
absolute.)

Since performing Test(a, r) takes time at most $O(r^{2}\log^{2} n)$, or even
$O(r^{1+\varepsilon} \log^{2+\varepsilon} n)$ if FFT is used for
multiplication of polynomials and of numbers modulo n, this gives a
deterministic polynomial-time algorithm as desired, since obviously checking
that n is a non-trivial power can be done in polynomial time. Recall that the
Fast Fourier Transform (FFT) is a fast algorithm which reduces the number of
multiplications of coefficients needed for the multiplication of polynomials
of degree r from $O(r^{2})$ to $O(r \log r)$, and that the FFT reduces the
time of multiplication of two integers modulo n from $O(\log^{2} n)$ to
$O(\log n \log\log n)$. In fact, the result actually proved by them is
somewhat stronger: one can find in a deterministic way a single number
$r \ll \log^{6} n$ for which the validity of Test(a, r) for all
$a \ll \log^{4} n$ suffices to imply the primality of n. This improves the
maximal running time of the algorithm from $O(\log^{18+\varepsilon} n)$ to
$O(\log^{12+\varepsilon} n)$. Under a well known hypothesis on the density of
Sophie Germain primes, the actual running time is in fact only
$O(\log^{6+\varepsilon} n)$ (Sophie Germain primes are odd primes q such that
r = 2q + 1 is also a prime):


Conjecture 2.2 (On the density of the Sophie Germain primes).

$$\#\{q \le x \mid q \text{ and } 2q+1 \text{ are primes}\} \sim C \cdot \frac{x}{\log^{2} x}.$$

Notice that there is an obvious analogy with the asymptotic law of the
distribution of primes (1.1.14) (the density of all primes less than or equal
to x):

$$\#\{q \le x \mid q \text{ is prime}\} \sim \frac{x}{\log x}.$$

More precisely, the results of Agrawal-Kayal-Saxena, which imply the
above statements, are as follows. Here P(m) denotes the largest prime
divisor of an integer m and $o_r(m)$ the order of m (mod r), where r is any
prime not dividing m.


Proposition 2.3. For any n, there is a prime $r \ll \log^{6} n$ such that
$P(o_r(n)) \ge 2\sqrt{r}\log n$.


Proposition 2.4. Let n be an integer and $r \nmid n$ a prime satisfying


(a) $P(o_r(n)) \ge l$,
(b) Test(a, r) is true for a = 1, 2, ..., l,
(c) n has no prime factors ≤ l,


where $l = 2\sqrt{r}\log n$. Then n is a power of a prime number.


The proof uses a result of Fouvry [Fou85] and Baker-Harman [BaHa96],
which says that $P(r-1) > r^{2/3}$ for a positive proportion of all primes r.
This result, proved with sieve theory, is difficult but not surprising since
it is easy to see that $P(m) > m^{2/3}$ (or even $P(m) > m^{c}$ for any fixed
c < 1) for a positive proportion of all integers m. (The number of m ≤ x
having a prime factor $q > x^{c}$ for $c > \frac{1}{2}$ is
$\sum_{x^{c} < q \le x,\ q \text{ prime}} [x/q]$, which is asymptotically
equal to $\log(1/c)\,x$ for x large.) We will show that, for C a sufficiently
large absolute constant, there exists for every n a prime number r satisfying

$$(2\log n)^{6} < r \le (C\log n)^{6}, \qquad o_r(n) > r^{1/3}, \qquad P(r-1) > r^{2/3}. \qquad (2.2.30)$$

Indeed, by the result just quoted, the number of primes
$r \le x = (C\log n)^{6}$ with $P(r-1) > r^{2/3}$ is at least
$c\,\pi(x) \sim \frac{c\,C^{6}\log^{6} n}{6\log\log n}$, where c is an absolute
constant. The number of primes $\le (2\log n)^{6}$ is
$\ll \frac{\log^{6} n}{\log\log n}$, and the number of r ≤ x with
$o_r(n) \le x^{1/3}$ is $\ll \frac{C^{4}\log^{5} n}{\log\log n}$, since all
these r divide the number $N := \prod_{j=1}^{x^{1/3}} (n^{j}-1)$, and

$$\text{number of prime factors of } N \ll \sum_{j=1}^{x^{1/3}} \frac{j\log n}{\log(j\log n)} \ll \frac{x^{2/3}\log n}{\log\log n} = \frac{C^{4}\log^{5} n}{\log\log n}.$$

It follows that for C sufficiently large there are
$\gg \frac{\log^{6} n}{\log\log n}$ primes satisfying (2.2.30). Let r be such a
prime and q = P(r − 1). Then q is a prime dividing r − 1 but not
$(r-1)/o_r(n)$ (since $(r-1)/o_r(n) < r^{2/3} < q$), so q divides $o_r(n)$ and
$P(o_r(n)) \ge q > r^{2/3} > 2\sqrt{r}\log n$.

The idea of the proof of Proposition 2.4 is to consider the integers
$n^{i}p^{j}$ and $n^{k}p^{l}$, and to show that there exist two different
couples (i, j) and (k, l) such that $n^{i}p^{j} \equiv n^{k}p^{l} \pmod r$.

Set $q = P(o_r(n))$. Since q is a prime, q must divide $o_r(p)$ for some prime
divisor p of n. The field extension $K = \mathbb{F}_p[\zeta]$, where $\zeta$ is
a non-trivial r-th root of unity, has degree $d = o_r(p) \ge q$. Let G be the
subgroup of $K^{\times}$ generated by $\zeta - 1, \zeta - 2, \ldots, \zeta - l$.
Then we have

$$|G| \ge \binom{d+l-1}{l}, \qquad (2.2.31)$$

because the elements
$(\zeta-1)^{d_1}(\zeta-2)^{d_2}\cdots(\zeta-l)^{d_l}$
($d_i \ge 0$, $\sum d_i < d$) of G are distinct. (The linear functions
x − 1, ..., x − l are distinct irreducible polynomials modulo p because of
assumption (c), and $\zeta$ cannot satisfy a polynomial equation of degree
< d.) On the other hand, we claim that

$$|G| < n^{2\sqrt{r}} \qquad (2.2.32)$$

if n is not a power of p, and this proves the proposition since (2.2.31) and
d ≥ q ≥ l imply

$$|G| \ge \binom{d+l-1}{l} \ge \binom{2l-1}{l} \ge e^{l} = n^{2\sqrt{r}}.$$

To prove (2.2.32), we denote by $\sigma_s$ ($s \in \mathbb{Z}$) the
automorphism of K induced by $\zeta \mapsto \zeta^{s}$. We have
$\sigma_p(g) = g^{p}$ for any $g \in K^{\times}$ and $\sigma_n(g) = g^{n}$ for
any $g \in G$ by virtue of assumption (b), so $\sigma_s(g) = g^{s}$ for any s
of the form $n^{i}p^{j}$. Let
$S = \{n^{i}p^{j} \mid 0 \le i, j \le \sqrt{r}\}$. If n is not a power of p,
then these elements are all distinct, |S| > r, and we can find
$s \ne s' \in S$ with $s \equiv s' \pmod r$. But then
$g^{s} = \sigma_s(g) = \sigma_{s'}(g) = g^{s'}$. Taking for g a generator of
the cyclic group G we deduce from this that
$|G| \le |s - s'| < n^{2\sqrt{r}}$, as desired.

The algorithm to check primality is therefore as follows. First check that no
root $n^{1/k}$ ($2 \le k \le \log_2 n$) is integral. Then check successive
primes $r > 4\log^{2} n$ until one is found for which r − 1 has a prime factor
$q \ge 2\sqrt{r}\log n$ with $n^{(r-1)/q} \not\equiv 1 \pmod r$. By
Proposition 2.3 the smallest such r is $\ll \log^{6} n$. Now check (b) and (c)
of Proposition 2.4; n is prime if and only if both hold.


Remark 2.5. It seems that the smallest r satisfying the condition of
Proposition 2.3 is not only $\ll \log^{6} n$, but is very close to the minimum
possible value $r_0 = [4\log^{2} n]$. For instance, a two-line PARI program
checks that for $n = 10^{300} + 1$, already 1908707, the second prime
$\ge r_0$, works and that for $n = 10^{j} + 1$ (j ≤ 300) one never needs to try
more than 10 primes, or to go further than $r_0 + 186$, before achieving
success. (Total computation time is about 2 seconds on a SUN work station.)


The given version is due to Dan Bernstein, who slightly improved the
original version of August 6, 2002; his contribution is the use of the
inequality $\binom{2l-1}{l} \ge n^{2\sqrt{r}}$, cf. [Mor03], [Ber03], and
[Mor03a].


2.2.5 The algorithm of M. Agrawal, N. Kayal and N. Saxena


(See [AKS], p. 4, and [Mor03a], p. 4.)
Input: integer n > 1

 1. if (n is of the form a^b, b > 1) output COMPOSITE;
 2. r := 2;
 3. while (r < n) {
 4.     if (r is prime)
 5.         if r divides n output COMPOSITE;
 6.         find the largest prime factor q of r − 1;
 7.         if (q ≥ 4√r log_2 n) and (n^((r−1)/q) ≢ 1 mod r)
 8.             break;
 9.     r := r + 1;
10. }
11. for a = 1 to 2√r log_2 n
12.     if ((x − a)^n ≢ (x^n − a) mod (x^r − 1, n)) output COMPOSITE;
13. output PRIME;
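The listing above can be transcribed almost line by line. The following Python sketch is only meant to make the steps concrete for very small n: it uses naive polynomial multiplication modulo $(x^{r} - 1, n)$ instead of FFT, trial division for the auxiliary primality checks on r, and floating-point values for $\sqrt{r}\log_2 n$; all function names are choices made here, and no claim about efficiency is intended.

```python
import math

def is_perfect_power(n):
    """Step 1: is n = a^b with a > 1, b > 1? (integer binary search for roots)"""
    for b in range(2, n.bit_length() + 1):
        lo, hi = 1, 1 << (n.bit_length() // b + 1)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if mid ** b <= n:
                lo = mid
            else:
                hi = mid - 1
        if lo > 1 and lo ** b == n:
            return True
    return False

def is_small_prime(r):
    return r > 1 and all(r % d for d in range(2, math.isqrt(r) + 1))

def largest_prime_factor(m):
    largest, d = 1, 2
    while d * d <= m:
        while m % d == 0:
            largest, m = d, m // d
        d += 1
    return m if m > 1 else largest

def polymul(f, g, r, n):
    """Multiply coefficient lists modulo (x^r - 1, n)."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                if gj:
                    h[(i + j) % r] = (h[(i + j) % r] + fi * gj) % n
    return h

def congruence_holds(a, r, n):
    """Step 12: is (x - a)^n = x^n - a modulo (x^r - 1, n)?  Assumes r >= 2."""
    base = [0] * r
    base[0], base[1] = (-a) % n, 1
    result, e = [1] + [0] * (r - 1), n
    while e:                               # square-and-multiply
        if e & 1:
            result = polymul(result, base, r, n)
        base = polymul(base, base, r, n)
        e >>= 1
    rhs = [0] * r
    rhs[n % r] = 1
    rhs[0] = (rhs[0] - a) % n
    return result == rhs

def aks_is_prime(n):
    if is_perfect_power(n):                # step 1
        return False
    log2n = math.log2(n)
    r = 2                                  # step 2
    while r < n:                           # steps 3-10
        if is_small_prime(r):
            if n % r == 0:
                return False
            q = largest_prime_factor(r - 1)
            if q >= 4 * math.sqrt(r) * log2n and pow(n, (r - 1) // q, r) != 1:
                break
        r += 1
    for a in range(1, int(2 * math.sqrt(r) * log2n) + 1):   # steps 11-12
        if not congruence_holds(a, r, n):
            return False
    return True                            # step 13

print([n for n in range(2, 32) if aks_is_prime(n)])
```

For inputs this small the while loop never reaches the break, so compositeness is detected in step 5 and the polynomial test only confirms the primes; the polynomial stage becomes decisive only when n is large compared with r.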


Theorem 2.6. The algorithm produces PRIME if and only if n is prime.


Remark 2.7. Practically, one can certainly find r of the size
$O((\log n)^{2})$ in order to satisfy the conditions in the algorithm. This
leads to the estimate of the complexity $\tilde{O}((\log n)^{6})$ in the best
case.


2.2.6 Practical and Theoretical Primality Proving. The ECPP 
(Elliptic Curve Primality Proving by F.Morain, see [AtMo93b]) 


The questions of practical primality proving of numbers with thousands of
digits and the questions of the mass production of large primes are discussed
in [Mor03a].

It is noticed by F. Morain that even the algorithm of Miller is already
slow, because it needs to compute numerous modular exponents. The quantity
$(\log n)^{6}$ in AKS gives an idea of the order of the degree of polynomials
with which one needs to work. In practice, it is almost certain that one can
find an $r = c(\log_2 n)^{2}$ with c ≥ 64. For example, if $n = 2^{512}$, then
in the most optimistic case $r = 64(\log_2 n)^{2} = 2^{24} > 16\cdot 10^{6}$,
which means manipulating dense polynomials occupying more than 1 Gbyte, which
is already rather difficult.

Suppose that we wish to prove the primality of the number $n = 10^9 + 7$
(which is a prime). Using an implementation of AKS by E. Thomé with GMP
4.1 on a 700 MHz PC, one takes $r = 57287$, which leads to $s = 14340$ (see
[Mor03a]). Each intermediate computation takes 44 seconds, giving a total
time of more than 7 days. If one uses directly the condition (2.2.29), one can
take $(r, q, s) = (3623, 1811, 1785)$, and this takes $1.67 \times 1785$ seconds, or about
49 minutes. The best triplet is $(r, q, s) = (359, 179, 4326)$, leading to a total
time of 6 minutes and 9 seconds.


One could compare these algorithms with the algorithm using Jacobi sums
(cf. [Coh96] for a presentation of this algorithm which is close to the one presented
in Section 2.2.1), and with another efficient algorithm, the ECPP (Elliptic
Curve Primality Proving) by F. Morain.

The ECPP rapidly produces a certificate (in $O((\log n)^4)$) using elliptic
curves $E$ over $\mathbf{Z}/n\mathbf{Z}$, and using an elementary observation on congruences
(2.2.28) adapted to groups like $E(\mathbf{Z}/n\mathbf{Z})$. Such a certificate is a long list of
numbers, produced by the program, that constitutes the proof of primality for
that number. In brief, a decreasing sequence of primes is built, the primality
of the successor in the list implying that of the predecessor.

The ECPP can prove the primality of 512-bit numbers in 1 second,
of 1024-bit numbers in 1 minute, and of 10000-bit numbers in a reasonable
time (about one month). According to [Mor03a], it seems that
even if one succeeds in lowering the number $r$ in the AKS algorithm, this will not
produce an algorithm which is practically more efficient than the ECPP, cf.
http://www.lix.polytechnique.fr/Labo/Francois.Morain/Prgms/ecpp.english.html.


2.2.7 Primes in Arithmetic Progression 


An important recent discovery in [GrTa] by B.J. Green and T. Tao states
that the primes contain arbitrarily long arithmetic progressions (cf. [Szm75],
[Gow01], but also http://primes.utm.edu/top20/ for interesting numerical
examples of long arithmetic progressions of consecutive primes).

It was a well-known classical folklore conjecture that there are arbitrarily 
long arithmetic progressions of prime numbers. In Dickson’s History of the 
Theory of Numbers [Dic52] it is stated that around 1770 Lagrange and Waring 
investigated how large the common difference of an arithmetic progression of 
L primes must be. 

It was proved in [GrTa] that there are arbitrarily long arithmetic progres- 
sions of primes. There are three major ingredients. The first is Szemerédi’s 
theorem, which asserts that any subset of the integers of positive density con- 
tains progressions of arbitrary length. The second is a certain transference 
principle. This allows one to deduce from Szemerédi’s theorem that any sub- 
set of a sufficiently pseudorandom set of positive relative density contains 
progressions of arbitrary length. The third ingredient is a recent result of 
Goldston and Yildirim, cf. [GoYi03]. Using this, one may place the primes 
inside a pseudorandom set of “almost primes” with positive relative density. 

It was found in 1993 by Moran, Pritchard and Thyssen (cf. [MPTh]) that 
11410337850553+4609098694200k is prime for k = 0,1,...,21. In 2003, Markus 
Frind found the rather larger example 376859931192959 + 18549279769020k 
of the same length. The main theorem of [GrTa] resolves the above conjecture.

Theorem 2.8 (Theorem 1.1 of [GrTa]). The prime numbers contain
arithmetic progressions of length $k$ for all $k$.


A slightly stronger result was established:

Theorem 2.9 (Theorem 1.2 of [GrTa]). Let $A$ be any subset of the prime
numbers of positive relative upper density, thus

$$\limsup_{N \to \infty} \pi(N)^{-1} |A \cap [1, N]| > 0,$$

where $\pi(N)$ denotes the number of primes less than or equal to $N$. Then $A$
contains arithmetic progressions of length $k$ for all $k$.
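
The two explicit progressions of 22 primes quoted in this section are small enough to be checked directly. The following Python sketch does this with a Miller-Rabin test; the function names are ours, and the choice of the first twelve primes as bases is an assumption that makes the test deterministic only for numbers below about 3.3 · 10^24, which covers the terms involved here.

```python
def is_prime(n):
    """Miller-Rabin with fixed bases; deterministic for n < 3.3 * 10**24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:                 # write n - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False              # a witnesses that n is composite
    return True


def is_prime_progression(start, step, length):
    """True if start + step*k is prime for k = 0, ..., length - 1."""
    return all(is_prime(start + step * k) for k in range(length))
```

For instance, `is_prime_progression(11410337850553, 4609098694200, 22)` checks the progression of [MPTh].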


2.3 Factorization of Large Integers 


2.3.1 Comparative Difficulty of Primality Testing and 
Factorization 


Let n > 1 be an integer. The problem of finding integers a,b > 1 with n = 
ab can be divided into two steps: first, to establish their existence (this is 
solved by any primality test), second, to find them explicitly (factorization). 
In practice, the primality test described in §2.2.1 does not give a concrete
divisor of $n$. In fact, when $n$ fails such a test, it usually already fails one of
the necessary conditions in §2.2.3 b), so that the algorithm stops before we
come to the stage of calculating potential divisors. Therefore, this algorithm
factorizes only primes and those $n$ which admit small divisors, namely, divisors
of the numbers $s$ and $t$ defined in §2.2.1.

As we mentioned in §2.1, an efficient factorization algorithm could be used 
for breaking a standard public key cryptosystem. For this reason, factorization 
has become an applied problem, attracting considerable effort and support 
([Pet85], [Sim79], [Kob94]). However, the running times of the best known
factorization algorithms do not allow one to factorize a product $n$ of two
150-digit (decimal) primes. The theoretical bound (cf. [Coh2000]) for this running
time is of order

$$\exp\left(\sqrt[3]{\tfrac{64}{9} \log n \cdot (\log\log n)^2}\right), \qquad (2.3.1)$$

and for a 300-digit $n$ they may require billions of years. This made Odlyzko ask
whether we now see the actual level of difficulty of the factorization problem
or whether we are just overlooking something essential, cf. [Pet85].

Anyway, the progress in factorization of some concrete large integers 
([Wun85], [Wag86]) relied more on the new hardware or parallel computa- 
tion schemes, than on the discovery of conceptually new algorithms, cf. more 
recent developments in [Ma99]. 


2.3.2 Factorization and Quadratic Forms 


If $n = x^2 - y^2$, then $x - y$ is in most cases a non-trivial divisor of $n$. This simple
remark leads to the “Fermat factorization algorithm” which generally requires
$O(n^{1/2})$ operations but is more efficient if $n$ is a product of two numbers $t$, $s$
with a small difference. Then $n = x^2 - y^2$ where $x = (t + s)/2$, $y = (t - s)/2$.
The algorithm consists of calculating $x^2 - n$ for $x$ starting with $[\sqrt{n}] + 1$ until a
perfect square is found. Similar considerations can be useful in other problems
([Bril81]). One can also generalize this trick and use other quadratic forms in
factorization algorithms ([Kob94], [Ries85]).
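
The search just described can be written in a few lines. The Python sketch below is a minimal illustration assuming that n is odd; the function name is ours. For a prime n it ends with the trivial factorization (1, n).

```python
from math import isqrt


def fermat_factor(n):
    """Return (x - y, x + y) with n = x**2 - y**2, for odd n > 1."""
    x = isqrt(n)
    if x * x == n:                    # n itself is a perfect square
        return x, x
    x += 1                            # start with [sqrt(n)] + 1
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:               # x**2 - n is a perfect square
            return x - y, x + y
        x += 1
```

For n = 5959 the loop tries x = 78, 79, 80 and stops at 80^2 − 5959 = 21^2, which gives 5959 = 59 · 101.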

Consider an imaginary quadratic field $\mathbf{Q}(\sqrt{-n})$. Let $n$ be square-free.
Denote by $Cl(\Delta)$ the ideal class group of this field (cf. §1.2.8, §4.2.2). The
elements of this group may be identified with the classes under $\mathbf{Z}$-equivalence of
the primitive, positive definite quadratic forms $f(x, y) = ax^2 + bxy + cy^2$ with
negative discriminant $\Delta = b^2 - 4ac$, where $\Delta = -n$ if $n \equiv 3 \bmod 4$, $\Delta = -4n$
if $n \equiv 1 \bmod 4$. (Here we assume $n$ to be odd.) Denote by $\alpha = (a, b, c)$ such
a form. We shall call $\alpha$ ambiguous if it belongs to one of the types $(a, 0, c)$,
$(a, a, c)$ or $(a, b, a)$ ([Gau], [Shan71]). The discriminant of an ambiguous form
has the explicit factorization: $-\Delta = 4ac$ (resp. $a(4c - a)$, $(2a - b)(2a + b)$)
for $\alpha = (a, 0, c)$ (resp. $(a, a, c)$, $(a, b, a)$). One easily sees that a converse
statement is also true (cf. [BS85]): a factorization of $\Delta$ of this type determines an
ambiguous form. On the other hand, there are independent methods for
constructing ambiguous forms which are based on the following property: they
represent elements of order two in the class group $Cl(\Delta)$. In 1971 D. Shanks
devised a rather fast algorithm allowing one to factorize $n$ in $O(n^{1/4})$
operations and to determine the structure of the group $Cl(\Delta)$. This method uses
the analytic formula due to Dirichlet:


$$L(1, \chi_\Delta) = \frac{\pi h(\Delta)}{\sqrt{|\Delta|}} \qquad (h(\Delta) = |Cl(\Delta)|).$$

Here $\chi_\Delta(m) = \left(\frac{\Delta}{m}\right)$, and $L(1, \chi_\Delta)$ is the value at $s = 1$ of the Dirichlet
$L$-function

$$L(s, \chi) = \sum_{m=1}^{\infty} \chi(m) m^{-s} = \prod_{p} (1 - \chi(p) p^{-s})^{-1}.$$

The approximate formula

$$h(\Delta) \approx \frac{\sqrt{|\Delta|}}{\pi} \prod_{p \le P} \left(1 - \frac{\chi_\Delta(p)}{p}\right)^{-1}$$

is valid with a relative error $< 0.1\%$ for $P > 132000$. The elements of the class
group are constructed with the help of small primes $p$ such that $\left(\frac{\Delta}{p}\right) = 1$.
They are represented by the forms $F_p = (p, B_p, C_p)$ whose coefficients satisfy
the discriminant relation $\Delta = B_p^2 - 4pC_p$, and are found from the condition
$\Delta \equiv B_p^2 \bmod p$. Knowing the class number $h(\Delta) = |Cl(\Delta)|$, we can construct
the second order elements starting with $x = F_p$, raising it to the power equal to
the maximal odd divisor of $h(\Delta)$, and then consecutively squaring until we get 1.


2.3.3 The Probabilistic Algorithm CLASNO 


(cf. [Pom87], [Sey87]). The idea of using $Cl(\Delta)$ in factorization algorithms
can be considerably improved. In this algorithm, one bypasses the calculation
of $h(\Delta)$, and the running time is estimated by $L = \exp(\sqrt{\log n \cdot \log\log n})$,
which grows slower than any positive power of $n$. Assume first that the prime
divisors of $h(\Delta)$ are small, or, rather, that $h(\Delta)$ divides $k!$ for a small $k$.
Take a random element $x \in Cl(\Delta)$, say, $x = F_p$ for some $p$ with $\left(\frac{\Delta}{p}\right) = 1$,
and calculate $B_k = x^{\text{odd part of } k!}$ (the power of $x$ whose exponent is the maximal
odd divisor of $k!$). Then an element of order 2 should be
contained in the sequence of consecutive squares of $B_k$. We need not know
the exact value of $k$; we just hope that some small $k$ will do. If we succeed,
we factorize $\Delta$ in $O(k)$ operations. If we fail, we can try the same trick for the
field $\mathbf{Q}(\sqrt{-an})$ where $a$ is a small square-free number.

In order to justify this procedure in general, one assumes that for variable
$a$, the class number $h(\Delta_a)$ of $\mathbf{Q}(\sqrt{-an})$ behaves like a random number varying
in a neighbourhood of $n^{1/2}$ (this estimate follows from the Dirichlet formula).
One can then estimate the probability that $h(\Delta_a)$ will be composed of only
small primes. To this end, denote by $\Psi(x, y)$ the number of natural numbers
$\le x$ not divisible by any prime $> y$ (they can be called “$y$-smooth”). Put
$k = L^\alpha$, $\alpha > 0$. The probability that a random number of order $n^{1/2}$ is
$L^\alpha$-smooth is $\Psi(n^{1/2}, L^\alpha)/n^{1/2}$. We must now understand the behavior of
$\Psi(x, y)/x$. Dickman (cf. [Hild86]) has shown that this depends essentially on
the value of $\log x/\log y$. Namely, for every $u > 0$ the limit $\lim_{y \to \infty} \Psi(y^u, y)/y^u$
exists. This limit is called the Dickman function $\rho(u)$ and is uniquely defined
by the following properties:


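$$\text{for } 0 < u \le 1, \quad \rho(u) = 1;$$

$$\text{for } u > 1, \quad \rho'(u) = -\frac{\rho(u - 1)}{u}.$$

At $u = 1$, $\rho(u)$ is continuous. As $u \to \infty$, $\rho(u) = u^{-(1 + o(1))u}$. De Bruijn (de
Bruijn N.G. (1951)) proved that

$$\Psi(y^u, y) = y^u \rho(u) \left(1 + O_\varepsilon\left(\frac{\log(u + 1)}{\log y}\right)\right),$$

where $y \ge 2$, $1 \le u \le (\log y)^{3/5 - \varepsilon}$ with a positive $\varepsilon$.

In our case, however, $L^\alpha$ grows slower than any positive power of $n$ so that
Dickman’s theorem is not applicable. The necessary estimate has recently been
obtained:

$$\Psi(n^{1/2}, L^\alpha) = n^{1/2} L^{-1/(4\alpha) + o(1)}.$$

For more details, see [Hild86].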

Returning to the factorization algorithm under discussion, one sees that
its running time for a given $k = [L^\alpha]$ is bounded by $L^\alpha$ and the probability of
success is about $L^{-1/(4\alpha)}$. Hence the total number of attempts should be about
$L^{1/(4\alpha)}$, and the total running time will be bounded by $L^{\alpha + 1/(4\alpha) + \varepsilon}$, $\varepsilon > 0$. This
estimate is minimized by choosing $\alpha = 1/2$ (i.e. $k = L^{1/2}$), and the result is
then $L^{1 + \varepsilon}$. Of course, theoretically we may get stuck on an especially bad $n$,
but this is quite improbable.

Let us illustrate the estimate $\Psi(x, y)/x \approx u^{-u}$, where $u = \log x/\log y$, when $u$ is
much smaller than $y$ (for a simple proof of this see [Kob87], p. 137). For example,
take $y \approx 10^6$ (so that $\pi(y) \approx 7 \cdot 10^4$ and $\log y \approx 14$) and $x \approx 10^{48}$. Then the fraction of
natural numbers $\le x$ which are products of primes $\le y$ is about $1/2^{24}$.
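
For small arguments the function Ψ(x, y) can be computed by brute force, which gives a feeling for how thin the set of smooth numbers is. The Python sketch below is an illustration with a function name of our choosing; it divides out all primes up to y from every m ≤ x with a sieve-like loop and counts the m that are reduced to 1. The last line repeats the arithmetic of the example above: with u = 48/6 = 8 one gets u^u = 8^8 = 2^24.

```python
def count_smooth(x, y):
    """Psi(x, y): how many 1 <= m <= x have all prime factors <= y."""
    rest = list(range(x + 1))                 # rest[m]: part of m not yet divided out
    for p in range(2, min(x, y) + 1):
        if rest[p] == p:                      # p is prime: no smaller prime divided it
            for m in range(p, x + 1, p):
                while rest[m] % p == 0:
                    rest[m] //= p
    return sum(1 for m in range(1, x + 1) if rest[m] == 1)


# Arithmetic of the example in the text: u = log x / log y = 48 / 6 = 8.
u = 48 // 6
smooth_fraction_denominator = u ** u          # equals 2**24
```

A call such as `count_smooth(10**6, 100) / 10**6` can then be compared with the value of u^(-u) for u = 3; the agreement is only rough for such small x.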


2.3.4 The Continued Fractions Method (CFRAC) and Real 
Quadratic Fields 


(cf. [Kob94], [Wun85], [Ries85], [Wil84]). Improving the Fermat factorization
method, let us try to seek solutions $x$, $y$ of the congruence $x^2 \equiv y^2 \bmod n$ such
that $x$ is not congruent to $\pm y \bmod n$. Then $\gcd(x + y, n)$ or $\gcd(x - y, n)$
is a non-trivial divisor of $n$ because $n$ divides $(x + y)(x - y)$ but neither
$x + y$ nor $x - y$. Let us look for $x$ among products of such numbers $x_i$ that
the residue of $x_i^2 \bmod n$ with the smallest absolute value is a product of small
primes. Then $y$ will also be a product of these primes. More precisely, consider
a set $B = \{p_1, p_2, \dots, p_h\}$ all of whose elements are primes, except possibly $p_1$
which can be $-1$. Let us call such a set a factorization basis for $n$. We shall
refer to any integer $b$ such that the residue of $b^2 \bmod n$ with the smallest
absolute value is a product of (powers of) elements of $B$ as a $B$-number. Let
$x_i$ be a family of $B$-numbers, $a_i = \prod_{j=1}^{h} p_j^{\alpha_{ij}}$ the respective minimal residues
of $x_i^2 \bmod n$. Put

$$\epsilon_i = (\epsilon_{i1}, \epsilon_{i2}, \dots, \epsilon_{ih}) \in \mathbf{F}_2^h, \quad \text{where } \epsilon_{ij} \equiv \alpha_{ij} \bmod 2.$$

Suppose that the sum of vectors $\epsilon_i$ vanishes mod 2. Put

$$x = \prod_i x_i \bmod n, \qquad y = \prod_{j=1}^{h} p_j^{\gamma_j}, \quad \text{where } \gamma_j = \frac{1}{2} \sum_i \alpha_{ij}.$$

Then $x^2 \equiv y^2 \bmod n$.

Example 2.10. ([Kob87], p. 133). Let $n = 4633$, $B = \{-1, 2, 3\}$. Then $x_1 = 67$,
$x_2 = 68$, $x_3 = 69$ are $B$-numbers, because

$$67^2 \equiv -144 \bmod 4633, \quad 68^2 \equiv -9 \bmod 4633, \quad 69^2 \equiv 128 \bmod 4633.$$

Moreover, $\epsilon_1 = (1, 0, 0)$, $\epsilon_2 = (1, 0, 0)$, $\epsilon_3 = (0, 1, 0)$, so that we can put
$x = x_1 x_2 = 67 \cdot 68 \equiv -77 \bmod 4633$, $y = 2^2 \cdot 3^2 = 36$. Besides, $-77$ is not
congruent to $\pm 36 \bmod 4633$. Summarizing, we obtain a non-trivial divisor

$$41 = \gcd(-77 + 36, 4633) \text{ of } n = 4633.$$
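
The numbers of this example can be recomputed mechanically. The Python sketch below is an illustration only; the helper names are ours, and the pair (67, 68) is chosen by hand as in the text rather than found by linear algebra over F_2. It factors the minimal residues over B, checks that the exponent sums are even, and forms x and y.

```python
from math import gcd, prod


def min_abs_residue(v, n):
    """Residue of v mod n with the smallest absolute value."""
    r = v % n
    return r - n if r > n // 2 else r


def exponent_vector(a, base):
    """Exponents of a over the base (a leading -1 records the sign), or None."""
    if a == 0:
        return None
    exps = []
    for p in base:
        e = 0
        if p == -1:
            if a < 0:
                e, a = 1, -a
        else:
            while a % p == 0:
                a //= p
                e += 1
        exps.append(e)
    return exps if a == 1 else None


n, base = 4633, [-1, 2, 3]
vectors = {x: exponent_vector(min_abs_residue(x * x, n), base) for x in (67, 68, 69)}

chosen = [67, 68]                                   # epsilon_1 + epsilon_2 = 0 mod 2
totals = [sum(vectors[x][j] for x in chosen) for j in range(len(base))]
x = prod(chosen) % n
y = prod(p ** (t // 2) for p, t in zip(base, totals))   # here y = -36
divisors = sorted({gcd(x - y, n), gcd(x + y, n)})
```

Here the sign factor −1 enters y as well, so that y = −36; the two greatest common divisors are then the two prime factors of 4633.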


Of course, if we are unlucky, it may happen that $x \equiv \pm y \bmod n$. Then
one should choose a new $x_i$ or even a new $B$. An efficient method for seeking
$B$-numbers utilizes continued fractions of real quadratic irrationalities. Let
$x > 1$ be a real number, $x = [a_0, a_1, \dots]$ its continued fraction expansion.
Put $A_i/B_i = [a_0, a_1, \dots, a_i]$. These convergents can be calculated from the
relations $A_{-1} = B_{-2} = 1$, $A_{-2} = B_{-1} = 0$ and $A_i = a_i A_{i-1} + A_{i-2}$,
$B_i = a_i B_{i-1} + B_{i-2}$. From the relation

$$\frac{A_i}{B_i} - \frac{A_{i+1}}{B_{i+1}} = \frac{(-1)^{i+1}}{B_i B_{i+1}}$$

it follows that

$$|A_i^2 - x^2 B_i^2| < 2x, \qquad (2.3.2)$$

because

$$|A_i^2 - x^2 B_i^2| = B_i^2 \left|\frac{A_i}{B_i} - x\right| \cdot \left|\frac{A_i}{B_i} + x\right| < B_i^2 \cdot \frac{1}{B_i B_{i+1}} \left(2x + \frac{1}{B_i B_{i+1}}\right).$$

In particular, we can find the continued fraction expansion of $x = \sqrt{n}$ with
the help of the algorithm described in §1.4, and the $a_i$ form a periodic sequence.
Since $A_i^2 \equiv A_i^2 - nB_i^2 \bmod n$, (2.3.2) shows that the absolute value of the
smallest residue of $A_i^2 \bmod n$ is bounded by $2\sqrt{n}$, which can help in looking
for $B$-numbers. However, the $A_i$ quickly become large even with respect to $n$, and
to facilitate the calculation of $A_i^2 \bmod n$ one can use the congruence

$$A_{i-1}^2 \equiv (-1)^i Q_i \bmod n, \qquad (2.3.3)$$

where $Q_i$ is the denominator of $x_i = (\sqrt{n} + P_i)/Q_i$, that is,
$\sqrt{n} = [a_0, a_1, a_2, \dots, a_{i-1}, x_i]$.
In fact, applying formally the recurrence relations to $\sqrt{n}$ we get

$$\sqrt{n} = x = \frac{A_{i-1} x_i + A_{i-2}}{B_{i-1} x_i + B_{i-2}} = \frac{A_{i-1} \sqrt{n} + P_i A_{i-1} + Q_i A_{i-2}}{B_{i-1} \sqrt{n} + P_i B_{i-1} + Q_i B_{i-2}}.$$

Comparing the coefficients at 1 and $\sqrt{n}$, we obtain

$$Q_i A_{i-2} + P_i A_{i-1} = n B_{i-1},$$

$$Q_i B_{i-2} + P_i B_{i-1} = A_{i-1}.$$

Solving this for $Q_i$, we see that

$$(A_{i-2} B_{i-1} - A_{i-1} B_{i-2}) Q_i = n B_{i-1}^2 - A_{i-1}^2.$$


But the coefficient at $Q_i$ equals $(-1)^{i-1}$. This proves (2.3.3). Recall also
that $P_i$, $Q_i$ can be calculated using a very efficient algorithm which we restate
in a slightly changed form. Let $x_0 = (P_0 + \sqrt{n})/Q_0$ be a quadratic irrationality,
with $Q_0$ dividing $n - P_0^2$. Put $x_i = (P_i + \sqrt{n})/Q_i$. Then

$$P_{i+1} = a_i Q_i - P_i, \quad a_i = [(P_i + \sqrt{n})/Q_i], \qquad (2.3.4)$$

$$Q_{i+1} = Q_{i-1} + (P_i - P_{i+1}) a_i. \qquad (2.3.5)$$

This follows directly from $x_{i+1} = 1/(x_i - a_i)$.

If this method does not provide us with the required number of
$B$-numbers, we can repeat the calculations with $an$ instead of $n$ where $a$ is a
small square-free number. The number of operations required is estimated by

$$L^{\sqrt{3/2}} = \exp\left(\sqrt{\tfrac{3}{2} \log n \cdot \log\log n}\right)$$

(compare with section 2.3.3), and the practical efficiency of this algorithm
was demonstrated by its application to the Fermat number $F_7 = 2^{128} + 1$ (cf.
[MB75], [Wil84]).
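
The recurrences (2.3.4), (2.3.5) and the congruence (2.3.3) are easy to check numerically. The Python sketch below is an illustration with names of our choosing; it assumes that n is not a perfect square, starts from P_0 = 0, Q_0 = 1, and uses the convention Q_{-1} = (n − P_0^2)/Q_0 = n, which is our assumption for starting (2.3.5). The integer part a_i is computed with [√n] in place of √n, which gives the same value for positive integer Q_i.

```python
from math import isqrt


def sqrt_cf(n, steps):
    """Rows (i, a_i, P_i, Q_i, A_i, B_i) of the expansion of sqrt(n), n not a square."""
    s = isqrt(n)
    P, Q, Q_prev = 0, 1, n                # P_0, Q_0 and the starting value Q_{-1}
    A_prev, A_prev2 = 1, 0                # A_{-1}, A_{-2}
    B_prev, B_prev2 = 0, 1                # B_{-1}, B_{-2}
    rows = []
    for i in range(steps):
        a = (P + s) // Q                  # a_i = [(P_i + sqrt(n)) / Q_i]
        A = a * A_prev + A_prev2
        B = a * B_prev + B_prev2
        rows.append((i, a, P, Q, A, B))
        P_next = a * Q - P                                  # (2.3.4)
        Q, Q_prev = Q_prev + (P - P_next) * a, Q            # (2.3.5)
        P = P_next
        A_prev2, A_prev = A_prev, A
        B_prev2, B_prev = B_prev, B
    return rows


def check_congruence(n, steps):
    """Check A_{i-1}^2 - n B_{i-1}^2 = (-1)^i Q_i, which implies (2.3.3)."""
    rows = sqrt_cf(n, steps)
    return all(
        rows[i - 1][4] ** 2 - n * rows[i - 1][5] ** 2 == (-1) ** i * rows[i][3]
        for i in range(1, steps)
    )
```

For n = 11111 the first rows give the partial quotients 105, 2, 2, 4, 5, 2, 7 and the denominators 1, 86, 77, 46, 37, 91, 25.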

Let us describe also an elegant algorithm SQUFOF due to Shanks which is 
also based upon the arithmetic of real quadratic fields (cf. [Ries85], [Wil84]). 
It consists of two stages. 


1) Put $x_0 = \sqrt{n}$, that is, $P_0 = 0$, $Q_0 = 1$ in the formulae (2.3.4), (2.3.5).
Calculate $x_i$ until we find an odd integer $m$ such that $Q_{m-1} = t^2$ for some
natural $t$. From (2.3.3) it follows that $A_{m-2}^2 \equiv t^2 \bmod n$. Presumably, one
can then find a divisor of $n$ with the help of the Euclidean algorithm as
$\gcd(A_{m-2} + t, n)$. In practice, however, $A_{m-2}$ is usually too large to be
calculated directly, so that one changes tactics.

2) Put $\tilde P_0 = P_m$, $\tilde Q_0 = t$, $\tilde x_0 = (\tilde P_0 + \sqrt{n})/\tilde Q_0$ and calculate the tails of the
continued fraction expansion of $\tilde x_0$, $\tilde x_i = (\tilde P_i + \sqrt{n})/\tilde Q_i$. We perform this
until we find such $\tilde x_q$ that $\tilde P_q = \tilde P_{q+1}$. From (2.3.4) and (2.3.5) it follows
that

$$\tilde a_q \tilde Q_q = 2 \tilde P_q, \qquad \tilde Q_q \text{ divides } n - \tilde P_q^2.$$

Hence either $\tilde Q_q$ or $\tilde Q_q/2$ divides $n$. If this divisor is trivial, one should
again replace $n$ by $an$ for a small $a$ and repeat the calculations. Using a
calculator for factorizing a number < 10°, it is convenient to write the 
intermediate results in a table. Table 2.1 illustrates the course of
calculations for $n = 11111 = 41 \cdot 271$. In general, $q$ is about $m/2$ (in our example,
$m = 7$, $q = 4$). The algorithm is based on the fact that the fractional ideal
$(1, \tilde x_0)$ is of order two in $Cl(4n)$, and on the second stage we calculate the
corresponding ambiguous form in disguise. The number of operations is
estimated by $n^{1/4}$.

Table 2.1.

i              0    1    2    3    4    5    6    7
a_i          105    2    2    4    5    2    7
P_i            0  105   67   87   97   88   94   81
Q_i            1   86   77   46   37   91   25 = 5^2

i              0    1    2    3    4    5
\tilde a_i    37    3    1    1    4
\tilde P_i    81  104   73   25   82   82
\tilde Q_i     5   59   98  107   41  107

Since

$$\tilde x_4 = \frac{82 + \sqrt{11111}}{41} = 2 + \frac{\sqrt{11111}}{41},$$

the ideal $(1, \tilde x_4)$ corresponds to the ambiguous form $(41, 0, -11111/41)$, or
$(41, 0, -271)$, with the discriminant $4n$.
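
The two stages can be followed with a short program that reproduces the entries of Table 2.1. The Python sketch below is a simplified illustration and not a tuned implementation of SQUFOF: the names are ours, squares Q_i = 1 are skipped (our assumption, since they lead to a trivial divisor), and a failure is reported as `None` instead of retrying with a multiple an.

```python
from math import gcd, isqrt


def cf_step(s, P, Q, Q_prev):
    """One step of (2.3.4), (2.3.5): returns (a_i, P_{i+1}, Q_{i+1}, Q_i)."""
    a = (P + s) // Q
    P_next = a * Q - P
    return a, P_next, Q_prev + (P - P_next) * a, Q


def squfof(n, max_steps=100000):
    """Try to find a non-trivial divisor of a non-square n; None on failure."""
    s = isqrt(n)
    if s * s == n:
        return s
    # Stage 1: expand sqrt(n) until Q_i is a square t^2 with i even (i = m - 1).
    P, Q, Q_prev = 0, 1, n
    t = None
    for i in range(1, max_steps):
        _, P, Q, Q_prev = cf_step(s, P, Q, Q_prev)       # now P = P_i, Q = Q_i
        root = isqrt(Q)
        if i % 2 == 0 and root > 1 and root * root == Q:
            t = root
            break
    if t is None:
        return None
    _, P, _, _ = cf_step(s, P, Q, Q_prev)                # P = P_m
    # Stage 2: expand (P_m + sqrt(n)) / t until two consecutive P coincide.
    Q, Q_prev = t, (n - P * P) // t
    for _ in range(max_steps):
        _, P_next, Q_next, _ = cf_step(s, P, Q, Q_prev)
        if P_next == P:
            d = gcd(n, Q)                                # Q or Q/2 divides n
            return d if 1 < d < n else None
        P, Q, Q_prev = P_next, Q_next, Q
    return None
```

For n = 11111 the first stage stops at Q_6 = 25, the second stage runs through the values 81, 104, 73, 25, 82, 82 of the second table, and the function returns 41.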


2.3.5 The Use of Elliptic Curves 


The general idea of utilizing the calculations in a finite group (such as class 
group) in order to factorize n found unexpected implementations using groups 
of different types. 


a) Pollard’s $(p-1)$-method. Suppose that $n$ has such a prime factor $p$ that the
order of $(\mathbf{Z}/p\mathbf{Z})^\times$ is “smooth”, that is, $p - 1$ divides $k!$ for a not too large $k$,
say, $k < 100000$. Then we can proceed as follows: calculate consecutively
$a_i = 2^{i!} - 1 \bmod n$ using the recursive relation $a_{i+1} = (a_i + 1)^{i+1} - 1 \bmod n$
and find $\gcd(a_k, n)$; it will be divisible by $p$ in view of Fermat’s little
theorem. This will fail if there are no $p \mid n$ with smooth $p - 1$ ([Pol74]).
For a change, one can try to use the multiplicative groups of fields $\mathbf{F}_{p^r}$ of
order $p^r - 1$. For $r = 2$, we obtain the Williams $(p+1)$-algorithm ([Wil82]).

b) Much wider perspectives of varying the finite group in the factorization
algorithms are opened by elliptic curves over finite fields. Their use leads
to one of the fastest known factorization algorithms requiring $O(L^{1+\varepsilon})$
operations [LeH.87].


Choose a random elliptic curve $\Gamma$ and a point $P$ on it. To this end, choose
random integers $a$, $x_0$, $y_0$ and put $b = y_0^2 - 4x_0^3 - ax_0$. Then
$P = (x_0, y_0)$ is a point on the curve defined by the equation

$$\Gamma : y^2 = 4x^3 + ax + b$$

(cf. §1.3.3). It is an elliptic curve over $\mathbf{Q}$, if the right hand side cubic
polynomial has no multiple roots. We may also assume that the discriminant
of this polynomial is relatively prime to $n$; otherwise we either get a
non-trivial divisor of $n$, or must change the curve.

In the projective plane, $\Gamma$ is determined by the homogeneous equation

$$Y^2 Z = 4X^3 + aXZ^2 + bZ^3 \qquad (X = xZ,\ Y = yZ).$$

Reducing it modulo a prime $p$, we obtain an elliptic curve over $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$.
Its identity is $O_\Gamma = (0 : 1 : 0)$, and the order of $\Gamma(\mathbf{F}_p)$ equals $p + 1 - a_p$, where
$|a_p| < 2\sqrt{p}$ (Hasse’s theorem, see §1.3.3).

Assuming now that $(p + 1 - a_p) \mid k!$ for some $p \mid n$ and small $k$, we calculate
consecutively $P_i = i!P \bmod n$ in the projective plane over $\mathbf{Z}$. The prime $p$
must divide the $Z$-coordinate $Z_k$ of $P_k$, and hence $\gcd(n, Z_k)$. If we are lucky, $O(k)$
operations will provide us with a non-trivial divisor of $n$. Otherwise one should
renew the curve, without wasting too much time on an unsuccessful choice
(“the strategy of early interruption”). In order to optimize the choice of $k$ for
each test curve and the number of tests, let us take $p \approx n^\beta$. The probability of
success with $k = [L^\alpha]$ is approximately $\Psi(n^\beta, L^\alpha)/n^\beta = L^{-\beta/(2\alpha) + o(1)}$ (see §2.3.3).
Hence we shall have to try about $L^{\beta/(2\alpha)}$ random elliptic curves with a marked
point, whereas for each of them the number of operations will be estimated
by $L^\alpha$. The general number of operations $L^{\alpha + \beta/(2\alpha)}$ is minimal for $\alpha = \sqrt{\beta/2}$.
In the worst case, $\alpha = \beta = 1/2$, we get $L^{1+\varepsilon}$, $\varepsilon > 0$.

Notice that our estimates are based upon the following heuristic
conjecture: the orders of the groups $\Gamma(\mathbf{F}_p)$ behave with respect to the smoothness
property as random numbers taken from the interval $(p - 2\sqrt{p} + 1, p + 2\sqrt{p} + 1)$. The
belief in this conjecture is strengthened by the study of the set of isomorphism
classes of elliptic curves modulo a prime [LeH.87].

We note also that cryptosystems using elliptic curves have
been suggested [Kob87].

There exists a probabilistic algorithm with rigorously estimated running 


time 
O(LV*/?) 


due to [Dix84], and several probabilistic algorithms using linear or quadratic 
sieves, with the expected running time 


O(LY?) and O(L) 


respectively [Pom82], [Wag86]. 

Many more interesting algorithms and computer programs can be found
in Riesel’s book [Ries85]. One can also find there some heuristic arguments
in favour of the existence of algorithms which would be much faster than
everything we know now.

More recent information on factoring large integers and new records can 


be found in [Coh2000] and at the Web page of F.Morain. 


Part II 


Ideas and Theories 


3 


Induction and Recursion 


3.1 Elementary Number Theory From the Point of View 
of Logic 


3.1.1 Elementary Number Theory 


Almost all of part I of this book belongs to elementary number theory (ENT). 
This notion can be rigorously defined using tools of mathematical logic, but in 
order to do this one must first introduce a formal language of arithmetic and 
fix an adopted system of axioms (one or other version of Peano’s axioms). In 
order to avoid such irrelevant details, we restrict ourselves to some intuitive 
remarks. In ENT there are some initial statements and some axioms, which 
formalize our intuitive ideas of natural numbers (or integers), as well as certain 
methods for constructing new statements and methods of proofs. The basic 
tool for construction is recursion. In the simplest case assume that we want 
to define some property P(n) of a natural number n. Using the method of 
recursion we explain how one can decide whether P(n + 1) is true if it is 
already known whether $P(1)$, ..., $P(n)$ are true or not. Say, the property “$n$
is a prime” can be defined as follows: “1 is not a prime; 2 is a prime; $n + 1 \ge 3$
is a prime iff none of the primes among 1, 2, ..., $n$ divide $n + 1$”. Analogously
the main tool in the proofs of ENT is induction. In order to prove by induction
a statement of type “$\forall n$, $P(n)$ is true” we first prove say $P(1)$ and then the
implication “$\forall n$ the property $P(n)$ implies $P(n + 1)$”.

Even in the earliest research into the axiomatics of number theory (Peano,
Frege) it was established that all the notions empirically thought of as
belonging to ENT (such as divisibility, primality etc.), functions (the number
of divisors, the Euler function $\varphi(n)$, $\pi(x)$) and theorems (Fermat’s little
theorem, the quadratic reciprocity law etc.) can be respectively constructed by
recursion and proved by induction, cf. [Rog67], [Man80].

It happens sometimes that a result admits an elementary formulation, but
its elementary proof is not known. For example, the prime number theorem
$\pi(x) \sim \frac{x}{\log x}$ can be stated in an elementary way assuming that $x$ runs only
through natural numbers, and replacing $\log x$ by the sum $\sum_{i=1}^{x} \frac{1}{i}$; an elementary
proof of this theorem was found only in the late 40s by Selberg, cf. [Sel51],
while the analytic proof had been known for half a century.


3.1.2 Logic 


The study of ENT from the point of view of logic has led to new concrete
number theoretical results which we shall discuss below. However the most 
important consequence of this study has been that the place of ENT inside 
mathematics in general has become much clearer. We wish to stress the fol- 
lowing three aspects. 


a) ENT as a mathematical discipline in principle cannot be “self-sufficient”.
For every choice of axioms there will always be statements which can be
formulated in an elementary way, and which are decidable, but which
cannot be deduced using only elementary methods (cf. the theorem of Gödel
[Gö], discussed in [Man80]).
Thus the historical tradition of proving number theoretic facts using
analysis (Euler, Jacobi, Dirichlet, Riemann, Hardy, Littlewood, Vinogradov,
...), geometry (Minkowski, Hermite, ...) and generally all possible tools,
has deep reasons.

b) ENT can be used by means of formal logic to model any axiomatized
mathematical discipline inside elementary number theory (Gödel). In such
a modeling we forget the contentive sense of the definitions and theorems
of our theory and leave only information concerning their formal structure,
and syntactic rules for deducing one statement from others. Enumerating
by Gödel’s method all syntactically correct statements by natural
numbers, we can then write a program or algorithm to list all provable results
of our theory (its theorems). Thus a theory is modeled by a function
$f : \mathbf{Z}^+ \to \mathbf{Z}^+$ (the first $\mathbf{Z}^+$ is a number generating the theorem, the
second is the encoded statement in the theory). Instead of asking whether
the theorem with number $n$ is provable we can ask whether the equation
$f(x) = n$ is solvable.

Although the equation $f(x) = n$ is defined in terms of ENT, it is far from
being a Diophantine equation since the function $f$ is not a polynomial. As
was shown by Yu.V. Matiyasevich, it is possible to reduce this problem to
a Diophantine one (Hilbert’s tenth problem), see [Mat04].

He showed that one can find a polynomial $P_f(x_1, \dots, x_m; n)$ with integral
coefficients such that the solvability of $f(x) = n$ is equivalent to the
solvability of $P_f(x; n) = 0$ with $x \in (\mathbf{Z}^+)^m$. The calculation of $P_f$ from $f$ is
completely effective (as is the construction of $f$ given the system of axioms
defining the initial theory). In this sense the problem of provability of any
mathematical result is equivalent to a standard kind of number theoretical
problem. (The reader who is used to thinking not in terms of “provability”
but of “truthfulness” must at this point consciously take some intellectual
precautions. Considering for example the theorem of Gödel and Cohen that
the continuum hypothesis is independent of the standard axioms of set
theory, it is clear that “truthfulness”, as opposed to “provability”, is a rather
philosophical notion. It would therefore be unreasonable to expect it to
have a precise mathematical definition.)

c) ENT provides a framework for the precise formulation and study of the
notions of algorithm and (semi-)computable function. These notions,
implemented in the theory of recursive functions, turn out to be much more
universal than one could expect a priori (the Church thesis, cf. [Man80],
[Rog67], [KMP74]). The theory of recursive functions has both a
fundamental general mathematical meaning, and an applied meaning. Its
methods are used in proving the Matiyasevich theorem mentioned above.


In the next section we formulate some basic facts from the theory of recur- 
sive functions, which have independent number theoretical interest. We then 
give some precise definitions and hints of proofs. 


3.2 Diophantine Sets 


3.2.1 Enumerability and Diophantine Sets 


Definition 3.1. A subset $E \subset (\mathbf{Z}^+)^m$, $m \ge 1$ is called Diophantine if there
exists a polynomial with integral (or, equivalently, with natural) coefficients

$$P(t_1, \dots, t_m, x_1, \dots, x_n),$$

such that

$$(t_1, \dots, t_m) \in E \iff \exists (x_1, \dots, x_n) \in (\mathbf{Z}^+)^n,\ P(t, x) = 0.$$

Every Diophantine set is enumerable in the following informal sense of the
word: there is a deterministic algorithm, which produces one by one all
elements of $E$ (a formal definition will be given in the next section). Indeed, let
us check one by one all the elements of $(\mathbf{Z}^+)^{m+n}$: substitute them into $P$ and,
if we get zero, write down the first $m$ coordinates. We thus obtain a growing
list of elements of $E$, which exhausts $E$ when we pass to the limit.
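
The enumeration procedure just described can be imitated for a concrete polynomial. The Python sketch below is an illustration only: the function name, the box-shaped search region (all coordinates between 1 and a given bound) and the sample polynomial P(t, x, y) = t − (x + 1)(y + 1), whose Diophantine set is the set of composite numbers, are our choices. Raising the bound makes the list grow, which corresponds to the passage to the limit in the text.

```python
from itertools import product


def diophantine_elements(P, m, n, bound):
    """Elements (t_1..t_m) of E witnessed by tuples with coordinates in 1..bound."""
    found = []
    for point in product(range(1, bound + 1), repeat=m + n):
        if P(*point) == 0 and point[:m] not in found:
            found.append(point[:m])       # write down the first m coordinates
    return found


def composite_polynomial(t, x, y):
    """P(t, x, y) = t - (x + 1)(y + 1); it vanishes for some x, y >= 1 iff t is composite."""
    return t - (x + 1) * (y + 1)
```

With bound 10 the call `diophantine_elements(composite_polynomial, 1, 2, 10)` lists the composite numbers up to 10.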


3.2.2 Diophantineness of enumerable sets 


Theorem 3.2. Conversely, every enumerable set is Diophantine. Its defining 
polynomial can be effectively constructed from the algorithm generating E. 


It seems a priori that there are many more enumerable sets than
Diophantine sets; it is therefore clear that in proving Theorem 3.2, one needs to prove
the Diophantineness of some unexpected sets. J. Robinson discovered that
this problem can be simplified if one takes for granted the Diophantineness
of the set $\{(a, b, c) \mid a = b^c\}$, and Yu. V. Matiyasevich (cf. [Mat72], [Mat04])
established this last step. Below we give some examples and constructions
used in the proof, which are purely number-theoretical. We first formulate
the following very general property.


3.2.3 First properties of Diophantine sets 


Proposition 3.3. The class of Diophantine sets contains the level sets of 
polynomials with integral coefficients, and it is closed with respect to the op- 
erations of finite direct sum, finite intersection, and projection. 


This follows immediately from the definition. It suffices to note that if
$E, F \subset (\mathbf{Z}^+)^m$ correspond to polynomials $P$, $Q$ respectively, then $E \cap F$
corresponds to $P^2 + Q^2$; $E \cup F$ corresponds to $PQ$, and $E \times F$ corresponds to
$P^2 + \tilde Q^2$, where $\tilde Q$ is obtained from $Q$ by renumbering the first $m$ variables.

Now we give the key arithmetical lemma: the proof of the Diophantineness
of a set related to solutions of Pell’s equation (it is important that for this set
one coordinate grows approximately as the exponent of the other).

Consider Pell’s equation $x^2 - dy^2 = 1$ ($d \in \mathbf{Z}^+$ is a square-free integer). Its
solutions $(x, y) \in \mathbf{Z}^2$ form a cyclic group with respect to the following law of
composition: if $(x_1, y_1)$ is a solution with the first coordinate minimal, then
any other solution is of the type $(x_n, y_n)$, where $n \in \mathbf{Z}^+$ and

$$x_n + y_n \sqrt{d} = (x_1 + y_1 \sqrt{d})^n.$$

The number $n$ is called the solution number (cf. Part I, §1.2.5).

The coordinates $x_n$, $y_n$ grow exponentially with $n$, but the set of solutions,
and its projections on the $x$- and $y$-axes are Diophantine. However this is
still not what we need: the main difficulty is to include the solution number
into a set of coordinates of a Diophantine set; only then will we be able to use
further arguments. This is done below.

It is convenient to use for $d$ the number $d = a^2 - 1$, $a \in \mathbb{Z}^+$, since in this
case $(x_1, y_1) = (a, 1)$. The equation $x^2 - (a^2 - 1)y^2 = 1$ will be called the
$a$-equation. Define two sequences $x_n(a)$, $y_n(a)$ to be the coordinates of its $n^{\text{th}}$
solution:

$$x_n(a) + y_n(a)\sqrt{a^2 - 1} = (a + \sqrt{a^2 - 1})^n.$$

Formal definitions of $x_n(a)$ and $y_n(a)$ as polynomials in $a$ can easily be given
by induction over $n$. Then $x_n(a)$ and $y_n(a)$ make sense for all $n \in \mathbb{Z}$ and
$a \in \mathbb{C}$. In particular, $x_n(1) = 1$, $y_n(1) = n$; in this extended range all of the
formulae given below will be valid.
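
The inductive definition can be made concrete with a small sketch. Multiplying $x + y\sqrt{a^2-1}$ by $a + \sqrt{a^2-1}$ gives the step $(x, y) \mapsto (ax + (a^2-1)y,\ x + ay)$; the function name `pell_pair` is an illustrative choice, not notation from the text.

```python
def pell_pair(n, a):
    """Return (x_n(a), y_n(a)) for n >= 0.

    Starts from (x_0, y_0) = (1, 0) and multiplies n times by
    a + sqrt(a^2 - 1), tracking the rational and irrational parts.
    """
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + (a * a - 1) * y, x + a * y
    return x, y


# Every pair solves the a-equation x^2 - (a^2 - 1) y^2 = 1,
# including the degenerate case a = 1, where x_n = 1 and y_n = n.
for a in (1, 2, 3, 7):
    for n in range(6):
        x, y = pell_pair(n, a)
        assert x * x - (a * a - 1) * y * y == 1
```

The exponential growth of the coordinates is visible already for small values, for instance when comparing `pell_pair(n, 2)` for consecutive `n`.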


3.2.4 Diophantineness and Pell’s Equation 
Proposition 3.4. The set $E : y = y_n(a),\ a > 1$ is Diophantine in the
$(y, n, a)$-space.


The idea in the Diophantine reconstruction of $n$ from $(y, a)$ is based on
the remark that the congruence

$$y_n(a) \equiv n \bmod (a - 1)$$

determines $n$ uniquely for $n < a - 1$. In order to treat the general case, an
auxiliary $A$-equation is introduced, with big $A$. Its $n^{\text{th}}$ solution is brought in
so that $n$ is used only in a Diophantine context.

Besides the main variables $y, n, a$, one introduces six auxiliary variables:
$x$; $x', y'$; $A$; $x_1, y_1$. Furthermore define the following sets:

$E_1 : y \ge n,\ a > 1$;
$E_2 : x^2 - (a^2 - 1)y^2 = 1$;
$E_3 : y' \equiv 0 \bmod 2x^2y^2$;
$E_4 : x'^2 - (a^2 - 1)y'^2 = 1$;
$E_5 : A = a + x'^2(x'^2 - a)$;
$E_6 : x_1^2 - (A^2 - 1)y_1^2 = 1$;
$E_7 : y_1 - y \equiv 0 \bmod x'^2$;
$E_8 : y_1 \equiv n \bmod 2y$.


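The sets $E_i$ are all Diophantine, and $\mathrm{pr}\,E' = E$, where $E' = \bigcap_{i=1}^{8} E_i$. In order
to check this fact we use the following properties:

$$y_k(a) \equiv k \bmod (a - 1). \qquad (3.2.1)$$

$$\text{If } a \equiv b \bmod c \text{ then } y_n(a) \equiv y_n(b) \bmod c. \qquad (3.2.2)$$

$$\text{If } y_i(a) \equiv y_j(a) \bmod x_n(a),\ a > 1, \text{ then } i \equiv j \bmod 2n \text{ or } i \equiv -j \bmod 2n. \qquad (3.2.3)$$

$$\text{If } y_i(a)^2 \mid y_j(a) \text{ then } y_i(a) \mid j. \qquad (3.2.4)$$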


Properties (3.2.1) - (3.2.4) are easily deduced from the equalities

$$x_{n+m}(a) = x_n(a)x_m(a) + (a^2 - 1)y_n(a)y_m(a),$$
$$y_{n+m}(a) = x_n(a)y_m(a) + x_m(a)y_n(a).$$
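
A numerical sanity check of these statements can be sketched as follows. It is only a finite test on small ranges, not a proof; the helper names (`pell_pair`, `y`, `check_*`) and the chosen bounds are assumptions made for this illustration.

```python
def pell_pair(n, a):
    """(x_n(a), y_n(a)) for n >= 0, via multiplication by a + sqrt(a^2 - 1)."""
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + (a * a - 1) * y, x + a * y
    return x, y


def y(n, a):
    return pell_pair(n, a)[1]


def check_addition(max_a=6, max_n=8):
    """The two addition formulae for x_{n+m} and y_{n+m}."""
    for a in range(1, max_a):
        for n in range(max_n):
            for m in range(max_n):
                xn, yn = pell_pair(n, a)
                xm, ym = pell_pair(m, a)
                xs, ys = pell_pair(n + m, a)
                if xs != xn * xm + (a * a - 1) * yn * ym:
                    return False
                if ys != xn * ym + xm * yn:
                    return False
    return True


def check_3_2_1(max_a=8, max_k=12):
    """y_k(a) = k mod (a - 1)."""
    return all((y(k, a) - k) % (a - 1) == 0
               for a in range(2, max_a) for k in range(1, max_k))


def check_3_2_2(max_a=6, max_c=6, max_n=8):
    """a = b mod c implies y_n(a) = y_n(b) mod c (tested with b = a + c)."""
    return all((y(n, a) - y(n, a + c)) % c == 0
               for a in range(2, max_a)
               for c in range(1, max_c)
               for n in range(1, max_n))


def check_3_2_4(max_a=5, max_i=5, max_j=40):
    """y_i(a)^2 | y_j(a) implies y_i(a) | j."""
    return all(j % y(i, a) == 0
               for a in range(2, max_a)
               for i in range(1, max_i)
               for j in range(1, max_j)
               if y(j, a) % y(i, a) ** 2 == 0)
```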


3.2.5 The Graph of the Exponent is Diophantine 


We now prove that the set $E : y = a^n$ in the $(y, a, n)$-space is Diophantine.
It suffices to check Diophantineness of $E_0 = E \cap \{a \mid a > 1\}$. For $a > 1$ one
easily obtains by induction on $n$ that

$$(2a - 1)^n \le y_{n+1}(a) \le (2a)^n$$

in the notation of §3.2.4. From this it follows that

$$a^n = [y_{n+1}(Na)/y_{n+1}(N)]$$

for sufficiently large $N$. To be more precise, $E_0$ is the projection of the set

$$E_1 : a > 1;\quad 0 \le y_{n+1}(Na) - y\,y_{n+1}(N) < y_{n+1}(N);\quad N > 4n(y + 1);$$

and Diophantineness of $E_1$ is then obtained by introducing the trivial auxiliary
relations $y' = y_{n+1}(N)$ and $y'' = y_{n+1}(Na)$.
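
The quotient formula for $a^n$ can be tried out numerically. In the sketch below the value $a^n$ is computed directly only in order to pick $N > 4n(y+1)$ as in the text; the function names are illustrative assumptions.

```python
def y_seq(n, a):
    """y_n(a) for n >= 0, the second coordinate of the n-th solution."""
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + (a * a - 1) * y, x + a * y
    return y


def power_via_pell(a, n):
    """Recover a^n (a > 1, n >= 1) as the integral part of
    y_{n+1}(N a) / y_{n+1}(N) for a sufficiently large N."""
    y = a ** n                    # used only to choose N as in the text
    N = 4 * n * (y + 1) + 1       # any N > 4n(y + 1) will do
    return y_seq(n + 1, N * a) // y_seq(n + 1, N)
```

For example, `power_via_pell(2, 1)` uses $N = 13$ and evaluates $y_2(26) // y_2(13) = 52 // 26$.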
3.2.6 Diophantineness and Binomial coefficients 


Proposition 3.5. The set $E : r = \binom{n}{k},\ n \ge k$ in the $(r, k, n)$-space is
Diophantine.


3.2.7 Binomial coefficients as remainders


Lemma 3.6. If $u > n^k$ then $\binom{n}{k}$ is equal to the remainder of the division
of $[(u+1)^n/u^k]$ by $u$.


The proof follows from the binomial formula

$$\frac{(u+1)^n}{u^k} = \sum_{i=k+1}^{n} \binom{n}{i} u^{i-k} + \binom{n}{k} + \sum_{i=0}^{k-1} \binom{n}{i} u^{i-k}.$$

The first sum is divisible by $u$ and the second is less than 1 for $u > n^k$.
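
Lemma 3.6 is easy to test on small values. The following sketch compares the remainder with `math.comb` from the Python standard library; the default choice $u = n^k + 1$ is just the smallest admissible value and is an assumption of this example.

```python
from math import comb


def binom_via_remainder(n, k, u=None):
    """Binomial coefficient C(n, k) as the remainder of [(u+1)^n / u^k] mod u.

    Requires u > n^k; by default the smallest such u is used.
    """
    if u is None:
        u = n ** k + 1
    if u <= n ** k:
        raise ValueError("the lemma needs u > n^k")
    return ((u + 1) ** n // u ** k) % u


# Compare with the library value on a small range.
assert all(binom_via_remainder(n, k) == comb(n, k)
           for n in range(1, 10) for k in range(1, n + 1))
```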
The proof of Proposition 3.5 has the same scheme, using the auxiliary
variables $u$ and $v$ and the relations

$E_1 : u > n^k$; $E_2 : v = [(u+1)^n/u^k]$;
$E_3 : r \equiv v \bmod u$; $E_4 : r < u$; $E_5 : n \ge k$.

From the lemma it follows immediately that $E = \mathrm{pr}\bigcap_{i=1}^{5} E_i$. The Diophantineness
of $E_1$ follows from that of the exponent; the Diophantineness of $E_3$, $E_4$
and $E_5$ is obvious. The Diophantineness of $E_2$ becomes clear if we represent
$E_2$ in the form

$$u^k v \le (u+1)^n < u^k v + u^k$$

and use again the Diophantineness of the exponent.


3.2.8 Diophantineness of the Factorial 


Proposition 3.7. a) The set $E : m = k!$ is Diophantine.

b) The set

$$E : \frac{x}{y} = \binom{p/q}{k},\quad p > qk,$$

is Diophantine in the space $(x, y, p, q, k)$.


The proof is a modification of the arguments in §3.2.6, §3.2.7, using the 
following lemma. 


3.2.9 Factorial and Euclidean Division 


Lemma 3.8. a) If $k > 0$ and $n > (2k)^{k+1}$ then

$$k! = \left[ n^k \Big/ \binom{n}{k} \right].$$

b) Let $a > 0$ be an integer such that $a \equiv 0 \pmod{q^k k!}$ and $a > 2^{p-1}p^{k+1}$.
Then

$$\binom{p/q}{k} = a^{-1}\left[a^{2k+1}(1 + a^{-2})^{p/q}\right] - a\left[a^{2k-1}(1 + a^{-2})^{p/q}\right].$$

The proof of this lemma follows from some elementary computations, and
Proposition 3.7 is proved using the same methods as above.
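
Part a) of the lemma, in the form $k! = [n^k / \binom{n}{k}]$ for $n > (2k)^{k+1}$, can be checked on small $k$. This is a minimal sketch; the function name and the choice $n = (2k)^{k+1} + 1$ are assumptions of the example.

```python
from math import comb, factorial


def factorial_via_quotient(k):
    """k! as the integral part of n^k / C(n, k), with n just above (2k)^(k+1)."""
    n = (2 * k) ** (k + 1) + 1
    return n ** k // comb(n, k)


assert all(factorial_via_quotient(k) == factorial(k) for k in range(1, 7))
```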

3.2.10 Supplementary Results


The Diophantine representations stated above are used in the proof of the
general theorem of Matiyasevich. On the other hand they can also be used to
find exponential-Diophantine representations for some interesting concrete
sets. As an example consider the set of prime numbers. By Wilson's theorem,
$p$ is a prime $\iff$ $(p - 1)! + 1$ is divisible by $p$. The set of prime numbers
is therefore a projection of the set of solutions to the following system of
equations

$$p = f + 1, \qquad q = f!, \qquad q - ap = -1,$$

which is Diophantine in view of the Diophantineness of $q = f!$.
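
Wilson's criterion itself is straightforward to run, although hopelessly slow as a primality test. The sketch below follows the variables $p$, $f$, $q$ of the system; the function name is an illustrative assumption.

```python
from math import factorial


def is_prime_wilson(p):
    """p >= 2 is prime exactly when (p - 1)! + 1 is divisible by p."""
    if p < 2:
        return False
    f = p - 1
    q = factorial(f)
    # Solvability of q - a*p = -1 in a positive integer a.
    return (q + 1) % p == 0
```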

Any Diophantine subset $E \subset \mathbb{Z}^+$ coincides with the set of all natural
values of a polynomial with integral coefficients (on $\mathbb{Z}$). Indeed, if $E$ is the
projection of $P(t; x_1, \dots, x_n) = 0$ then $Q(t; x_1, \dots, x_n) = t(1 - P^2)$ is the
appropriate polynomial. Thus the set of all primes can be represented as the set
of all natural values of a polynomial $Q$. (It should be noted that $Q$ will take
infinitely many other integral values $\le 0$, which is unavoidable.)
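
The passage from $P$ to $Q = t(1 - P^2)$ can be illustrated on a toy example. Here $E$ is the set of even positive integers, the projection of $P(t; x) = t - 2x = 0$; this choice of $P$ and the search bound are assumptions of the sketch.

```python
def P(t, x):
    return t - 2 * x


def Q(t, x):
    # Positive exactly when P(t, x) = 0, in which case Q(t, x) = t.
    return t * (1 - P(t, x) ** 2)


def positive_values(bound):
    """Positive values of Q for 1 <= t, x <= bound."""
    return sorted({Q(t, x)
                   for t in range(1, bound + 1)
                   for x in range(1, bound + 1)
                   if Q(t, x) > 0})
```

All other values of `Q` on positive arguments are zero or negative, in line with the remark in parentheses.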

The Fibonacci numbers form the sequence $1, 1, 2, 3, 5, 8, \dots$, $u_{n+2} = u_{n+1} + u_n$.
J. Jones found that this sequence can be represented as the set of positive
values of a very simple polynomial in two variables (this is not the case for
the set of all primes):

$$2a^4b + a^3b^2 - 2a^2b^3 - a^5 - ab^4 + 2a.$$
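
A brute-force search over a small box illustrates the statement. The polynomial factors as $a(2 - (b^2 + ab - a^2)^2)$, which explains why positive values occur only when $b^2 + ab - a^2 = \pm 1$; the bound and function names below are assumptions of the example.

```python
def jones(a, b):
    return (2 * a**4 * b + a**3 * b**2 - 2 * a**2 * b**3
            - a**5 - a * b**4 + 2 * a)


def positive_values(bound):
    """Positive values of the polynomial for 1 <= a, b <= bound."""
    return sorted({jones(a, b)
                   for a in range(1, bound + 1)
                   for b in range(1, bound + 1)
                   if jones(a, b) > 0})


# The factored form agrees with the expanded one.
assert all(jones(a, b) == a * (2 - (b * b + a * b - a * a) ** 2)
           for a in range(1, 20) for b in range(1, 20))
```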


Although, as we have noticed above, the question of the provability of any
theorem can in principle be reduced to a Diophantine equation, some concrete
problems admit natural reductions without the use of a formal language. We
refer the reader to the very interesting and informative article [DMR74]. In
particular, this article contains Diophantine forms of the Riemann Hypothesis
and the four-colour problem.


3.3 Partially Recursive Functions and Enumerable Sets 


3.3.1 Partial Functions and Computable Functions 


In this Section we give a precise definition of a class of partial functions from
$\mathbb{Z}^m$ to $\mathbb{Z}^n$. This definition can be considered as an adequate formalization of
the class of (semi-)computable functions. Using the definition one is able to
define the class of enumerable sets. We shall denote by $D(f)$ the domain of
definition of a partial function $f$ (see [Rog67], [Man80]).


3.3.2 The Simple Functions 


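$\mathrm{suc} : \mathbb{Z}^+ \to \mathbb{Z}^+$, $\mathrm{suc}(x) = x + 1$;

$1^{(n)} : \mathbb{Z}^n \to \mathbb{Z}^+$, $1^{(n)}(x_1, \dots, x_n) = 1$, $n \ge 0$;

$\mathrm{pr}^n_i : \mathbb{Z}^n \to \mathbb{Z}^+$, $\mathrm{pr}^n_i(x_1, \dots, x_n) = x_i$, $n \ge 1$.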


3.3.3 Elementary Operations on Partial functions 


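(a) Composition (or substitution). This operation takes a pair of partial
functions $f : \mathbb{Z}^m \to \mathbb{Z}^n$ and $g : \mathbb{Z}^n \to \mathbb{Z}^p$ and gives a partial function
$h = g \circ f : \mathbb{Z}^m \to \mathbb{Z}^p$, defined as follows:

$$D(g \circ f) = f^{-1}(D(g)) \cap D(f) = \{x \in \mathbb{Z}^m \mid x \in D(f),\ f(x) \in D(g)\},$$

$$(g \circ f)(x) = g(f(x)) \text{ for } x \in D(g \circ f).$$

(b) Junction. This operation takes partial functions $f_i$ from $\mathbb{Z}^m$ to $\mathbb{Z}^{n_i}$, $i =
1, \dots, k$, to the function $(f_1, \dots, f_k)$ from $\mathbb{Z}^m$ to $\mathbb{Z}^{n_1} \times \dots \times \mathbb{Z}^{n_k}$ defined
as follows:

$$D((f_1, \dots, f_k)) = D(f_1) \cap \dots \cap D(f_k),$$
$$(f_1, \dots, f_k)(x_1, \dots, x_m) = (f_1(x_1, \dots, x_m), \dots, f_k(x_1, \dots, x_m)).$$

(c) Recursion. This operation takes a pair of functions $f$ from $\mathbb{Z}^n$ to $\mathbb{Z}^+$ and
$g$ from $\mathbb{Z}^{n+2}$ to $\mathbb{Z}^+$ to the function $h$ from $\mathbb{Z}^{n+1}$ to $\mathbb{Z}^+$ defined as follows:

$$h(x_1, \dots, x_n, 1) = f(x_1, \dots, x_n) \quad \text{(the initial condition)};$$
$$h(x_1, \dots, x_n, k+1) = g(x_1, \dots, x_n, k, h(x_1, \dots, x_n, k)) \text{ for } k \ge 1 \quad \text{(the recursive step)}.$$

The domain of definition $D(h)$ is also described recursively:

$$(x_1, \dots, x_n, 1) \in D(h) \iff (x_1, \dots, x_n) \in D(f);$$

$$(x_1, \dots, x_n, k+1) \in D(h) \iff (x_1, \dots, x_n) \in D(f) \text{ and } (x_1, \dots, x_n, k, h(x_1, \dots, x_n, k)) \in D(g) \text{ for } k \ge 1.$$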

(d) The $\mu$-operation. This operation takes a partial function $f$ from $\mathbb{Z}^{n+1}$
to $\mathbb{Z}^+$ to the partial function $h$ from $\mathbb{Z}^n$ to $\mathbb{Z}^+$ which is defined as follows:

$$D(h) = \{(x_1, \dots, x_n) \in \mathbb{Z}^n \mid \exists\, x_{n+1} \ge 1,\ f(x_1, \dots, x_n, x_{n+1}) = 1 \text{ and } (x_1, \dots, x_n, k) \in D(f) \text{ for all } k \le x_{n+1}\},$$

$$h(x_1, \dots, x_n) = \min\{x_{n+1} \mid f(x_1, \dots, x_n, x_{n+1}) = 1\}.$$

Generally speaking, the role of $\mu$ is to introduce "implicitly defined" functions.
The use of $\mu$ makes it possible to introduce a one-by-one check of
objects in order to find a desired object in an infinite family. The following
three features of $\mu$ should be stressed immediately. The choice of the minimal
$y$ with $f(x_1, \dots, x_n, y) = 1$ is made, of course, in order to ensure that
the function $h$ is well defined. Also, the domain of definition of $h$ seems
at first sight to be artificially diminished: if, say, $f(x_1, \dots, x_n, 2) = 1$ and
$f(x_1, \dots, x_n, 1)$ is not defined, we consider $h(x_1, \dots, x_n)$ to be undefined,
rather than being equal to 2. The reason for this is the wish to preserve
the property that $h$ is intuitively computable. Finally we remark that
all previously defined operations produce everywhere defined functions if
applied to everywhere defined functions. This is obviously not the case
for the operation $\mu$. Hence this is the only operation responsible for the
appearance of partially defined functions.
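
The four elementary operations can be modelled, informally, by higher-order functions. The sketch below is an assumption-laden illustration: partial functions are ordinary Python callables on positive integers, "undefined" is modelled by an exception, and the `limit` parameter of `mu` is a purely practical safeguard that has no counterpart in the definition (a true $\mu$-search may run forever).

```python
def suc(x):
    return x + 1


def one(*xs):
    """The constant function 1^(n) for any number of arguments."""
    return 1


def pr(n, i):
    """The projection pr^n_i onto the i-th of n coordinates (1-based)."""
    return lambda *xs: xs[i - 1]


def compose(g, f):
    """Composition g o f; f may return a single value or a tuple."""
    def h(*xs):
        inner = f(*xs)
        if not isinstance(inner, tuple):
            inner = (inner,)
        return g(*inner)
    return h


def junction(*fs):
    """Junction (f_1, ..., f_k): apply every f_i to the same arguments."""
    return lambda *xs: tuple(f(*xs) for f in fs)


def recursion(f, g):
    """h(x, 1) = f(x); h(x, k + 1) = g(x, k, h(x, k))."""
    def h(*args):
        *xs, k = args
        value = f(*xs)
        for j in range(1, k):
            value = g(*xs, j, value)
        return value
    return h


def mu(f, limit=10_000):
    """h(x) = min{y | f(x, y) = 1}, searching y = 1, 2, ... one by one."""
    def h(*xs):
        for y in range(1, limit + 1):
            if f(*xs, y) == 1:
                return y
        raise LookupError("no witness found below the search limit")
    return h


# Example: a partial square root, defined only on perfect squares.
sqrt_partial = mu(lambda x, y: (x - y * y) ** 2 + 1, limit=1000)
add = recursion(suc, lambda x, k, prev: suc(prev))   # add(x, k) = x + k
```

Here `sqrt_partial(49)` returns 7, while `sqrt_partial(50)` has no witness and, in the idealized setting, would never terminate.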


3.3.4 Partially Recursive Description of a Function


Definition 3.9. (a) The sequence of functions $f_1, \dots, f_N$ is called a partially
recursive (resp. primitively recursive) description of a function $f = f_N$ if $f_1$
is one of the simple functions; $f_i$ is for all $i \ge 2$ either a simple function
or is obtained by applying the elementary operations to some of the functions
$f_1, \dots, f_{i-1}$ (resp. one of the elementary operations apart from $\mu$).

(b) The function $f$ is called partially recursive (resp. primitively recursive),
if it admits a partially recursive (resp. primitively recursive) description.


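Polynomials with positive values. We first establish the recursivity of sums
and products.

a) $\mathrm{sum}_2 : \mathbb{Z}^2 \to \mathbb{Z}^+$, $(x_1, x_2) \mapsto x_1 + x_2$.

Use recursion over $x_2$ starting from the initial condition $x_1 + 1 = \mathrm{suc}(x_1)$,
with the recursive step $x_1 + k + 1 = \mathrm{suc}(\mathrm{sum}_2(x_1, k))$.

b) $\mathrm{sum}_n : \mathbb{Z}^n \to \mathbb{Z}^+$, $(x_1, \dots, x_n) \mapsto \sum_{i=1}^{n} x_i$, $n \ge 3$.

Assuming that $\mathrm{sum}_{n-1}$ is recursive we obtain $\mathrm{sum}_n$ using junctions and
composition:

$$\mathrm{sum}_n = \mathrm{sum}_2 \circ (\mathrm{sum}_{n-1} \circ (\mathrm{pr}^n_1, \dots, \mathrm{pr}^n_{n-1}),\ \mathrm{pr}^n_n).$$

Another version is the recursion over $x_n$, starting from the initial condition
$\mathrm{suc} \circ \mathrm{sum}_{n-1}$ and the recursive step

$$\sum_{i=1}^{n-1} x_i + k + 1 = \mathrm{suc}(\mathrm{sum}_n(x_1, \dots, x_{n-1}, k)).$$

One finds that the number of recursive descriptions of a function increases
step-by-step, even if one only counts the "natural" descriptions.

c) $\mathrm{prod}_2 : \mathbb{Z}^2 \to \mathbb{Z}^+$, $(x_1, x_2) \mapsto x_1x_2$.

Use recursion over $x_2$ starting from the initial condition $x_1$ with the
recursive step $x_1(k + 1) = x_1k + x_1 = \mathrm{sum}_2(x_1k, x_1)$.

d) $\mathrm{prod}_n : \mathbb{Z}^n \to \mathbb{Z}^+$, $(x_1, \dots, x_n) \mapsto x_1 \cdots x_n$, $n \ge 3$:

$$\mathrm{prod}_n = \mathrm{prod}_2 \circ (\mathrm{prod}_{n-1} \circ (\mathrm{pr}^n_1, \dots, \mathrm{pr}^n_{n-1}),\ \mathrm{pr}^n_n).$$

e) "Subtraction of one": $\mathbb{Z}^+ \to \mathbb{Z}^+$:

$$x \mapsto x \dot{-} 1 = \begin{cases} x - 1, & \text{if } x \ge 2; \\ 1, & \text{if } x = 1. \end{cases}$$

We apply recursion to the simple functions

$$f : \mathbb{Z}^+ \to \mathbb{Z}^+,\ f = 1^{(1)}, \qquad g = \mathrm{pr}^3_2 : \mathbb{Z}^3 \to \mathbb{Z}^+ : (x_1, x_2, x_3) \mapsto x_2,$$

and as a result obtain the function $h(x_1, x_2) = x_2 \dot{-} 1$. Hence $x \dot{-} 1 = h \circ
(x, x)$, where $x = \mathrm{pr}^1_1(x)$.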

f) "Truncated difference":

$$(x_1, x_2) \mapsto x_1 \dot{-} x_2 = \begin{cases} x_1 - x_2, & \text{if } x_1 > x_2; \\ 1, & \text{if } x_1 \le x_2. \end{cases} \qquad (3.3.1)$$

This "truncated difference" is constructed by applying recursion to the
functions

$$f(x_1) = x_1 \dot{-} 1, \qquad g(x_1, x_2, x_3) = x_3 \dot{-} 1.$$

Let $F : \mathbb{Z}^n \to \mathbb{Z}^+$ where $F$ is any polynomial in $x_1, \dots, x_n$ with integral
coefficients taking only values in $\mathbb{Z}^+$. If all of the coefficients of $F$ are
non-negative then $F$ is a sum of products of functions $\mathrm{pr}^n_i : (x_1, \dots, x_n) \mapsto x_i$.
Otherwise $F = F^+ - F^-$, where $F^+$ and $F^-$ have non-negative coefficients,
and the values of the untruncated difference coincide with the values of the
truncated one $F^+ \dot{-} F^-$ by the assumption on $F$. In what follows we use the
recursivity of the functions $(x_1 - x_2)^2 + 1$ and $h = (f - g)^2 + 1$ where $f$ and
$g$ are recursive: this trick makes it possible to identify the "coincidence set"
$f = g$ with the "level set" $h = 1$, which is easier to tackle.
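
The descriptions a), c), e) and f) can be replayed with a small recursion combinator. This is a sketch under the assumption that all arguments are positive integers; the names `recursion`, `sum2`, `prod2`, `pred` and `monus` are illustrative only.

```python
def recursion(f, g):
    """h(x, 1) = f(x); h(x, k + 1) = g(x, k, h(x, k))."""
    def h(*args):
        *xs, k = args
        value = f(*xs)
        for j in range(1, k):
            value = g(*xs, j, value)
        return value
    return h


def suc(x):
    return x + 1


# a) x1 + 1 = suc(x1), x1 + k + 1 = suc(sum2(x1, k))
sum2 = recursion(suc, lambda x1, k, prev: suc(prev))

# c) x1 * 1 = x1, x1 * (k + 1) = sum2(x1 * k, x1)
prod2 = recursion(lambda x1: x1, lambda x1, k, prev: sum2(prev, x1))

# e) h(x1, 1) = 1, h(x1, k + 1) = k, so h(x1, x2) is x2 "minus one"
_h = recursion(lambda x1: 1, lambda x1, k, prev: k)


def pred(x):
    return _h(x, x)


# f) truncated difference: start from x1 "minus one", then repeat pred
monus = recursion(pred, lambda x1, k, prev: pred(prev))
```

Note that the truncated difference never drops below 1, because zero is not available in $\mathbb{Z}^+$.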

3.3.5 Other Recursive Functions

"The Step":

$$s^{x_0}_{a,b}(x) = \begin{cases} a, & \text{for } x \le x_0; \\ b, & \text{for } x > x_0. \end{cases}$$

For $x_0 = 1$ this is obtained using recursion with the initial condition $a$ and
the following value $b$. In the general case

$$s^{x_0}_{a,b}(x) = s^{1}_{a,b}((x + 1) \dot{-} x_0).$$


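$\mathrm{rem}(x, y)$ = the remainder in $\{1, \dots, x\}$ after dividing $y$ by $x$ (we do not have
zero!). We have

$$\mathrm{rem}(x, 1) = 1;$$
$$\mathrm{rem}(x, y + 1) = \begin{cases} 1, & \text{if } \mathrm{rem}(x, y) = x; \\ \mathrm{suc} \circ \mathrm{rem}(x, y), & \text{if } \mathrm{rem}(x, y) \ne x. \end{cases}$$

We use the following artificial trick. Consider the step $s(x) = 2$ for $x \ge 2$,
$s(1) = 1$ and set

$$\varphi(x, y) = s((\mathrm{rem}(x, y) - x)^2 + 1).$$

It is obvious that $\varphi(x, y) = 1$ if $\mathrm{rem}(x, y) = x$ and $\varphi(x, y) = 2$ otherwise,
hence

$$\mathrm{rem}(x, y + 1) = \varphi(x, y)\,\mathrm{suc}(\mathrm{rem}(x, y)) \dot{-} \mathrm{suc}(\mathrm{rem}(x, y)),$$

where $\dot{-}$ is the truncated difference (3.3.1). This gives us a recursive definition of rem. A generalization of this trick is
"conditional recursion":

$$h(x_1, \dots, x_n, 1) = f(x_1, \dots, x_n);$$
$$h(x_1, \dots, x_n, k + 1) = g_i(x_1, \dots, x_n, k, h(x_1, \dots, x_n, k)),$$
$$\text{if the condition } C_i(x_1, \dots, x_n, k; h)\ (i = 1, \dots, m) \text{ is satisfied.} \qquad (3.3.2)$$

We reduce the mutually exclusive conditions $C_i(x_1, \dots, x_n, k; h)$ to the form

$$C_i \text{ is satisfied} \iff \varphi_i(x_1, \dots, x_n, k; h(x_1, \dots, x_n, k)) = 1 \qquad (3.3.3)$$

(where $\varphi_i$ is an everywhere defined recursive function taking only the values 1 and 2).
Then the recursive step can be described as follows:

$$h(x_1, \dots, x_n, k + 1) = 2\sum_{i=1}^{m} g_i(x_1, \dots, x_n, k, h(x_1, \dots, x_n, k)) \dot{-} \sum_{i=1}^{m} (g_i\varphi_i)(x_1, \dots, x_n, k, h(x_1, \dots, x_n, k)). \qquad (3.3.4)$$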
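
This trick makes it possible to establish the primitive recursivity of the
following functions which will be used below.

The incomplete quotient:

$$\mathrm{qt}(x, y) = \begin{cases} \text{integral part of } y/x, & \text{if } y/x \ge 1; \\ 1, & \text{if } y/x < 1. \end{cases}$$

We have

$$\mathrm{qt}(x, y + 1) = \begin{cases} \mathrm{qt}(x, y), & \text{if } \mathrm{rem}(x, y + 1) \ne x,\ y + 1 \ne x; \\ \mathrm{qt}(x, y) + 1, & \text{if } \mathrm{rem}(x, y + 1) = x,\ y + 1 \ne x; \\ 1, & \text{if } y + 1 = x. \end{cases}$$

One reduces these conditions to the standard form (3.3.3) with the help of
the functions

$$\bar{s}((\mathrm{rem}(x, y + 1) - x)^2 + 1),$$
$$s((\mathrm{rem}(x, y + 1) - x)^2 + 1) \cdot \bar{s}((x - y - 1)^2 + 1),$$
$$s((x - y - 1)^2 + 1),$$

where

$$s(1) = 1,\ s(\ge 2) = 2; \qquad \bar{s}(1) = 2,\ \bar{s}(\ge 2) = 1.$$

The function $\mathrm{rad}(x)$, the integral part of $\sqrt{x}$. One has

$$\mathrm{rad}(1) = 1;$$
$$\mathrm{rad}(x + 1) = \begin{cases} \mathrm{rad}(x), & \text{if } \mathrm{qt}(\mathrm{rad}(x) + 1, x + 1) < \mathrm{rad}(x) + 1; \\ \mathrm{rad}(x) + 1, & \text{if } \mathrm{qt}(\mathrm{rad}(x) + 1, x + 1) = \mathrm{rad}(x) + 1. \end{cases}$$

These conditions can be reduced to the standard form (3.3.3) in a similar way.

The function $\min(x, y)$:

$$\min(x, 1) = 1;$$
$$\min(x, y + 1) = \begin{cases} \min(x, y), & \text{if } x \le y; \\ \min(x, y) + 1, & \text{if } x > y. \end{cases}$$

The function $\max(x, y)$ (similarly).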
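
The recurrences for rem, qt, rad and min can be executed directly as loops over the recursion variable. The sketch below takes the recursive steps in the forms "increase the quotient exactly when $\mathrm{rem}(x, y+1) = x$" and so on, and uses ordinary `if` statements in place of the step functions $s$, $\bar{s}$; it is an informal check, not a primitively recursive description.

```python
from math import isqrt


def rem(x, y):
    """Remainder of y modulo x taken in {1, ..., x}."""
    r = 1                              # rem(x, 1) = 1
    for _ in range(1, y):
        r = 1 if r == x else r + 1
    return r


def qt(x, y):
    """Integral part of y / x, or 1 if y < x."""
    q = 1                              # qt(x, 1) = 1
    for j in range(1, y):              # step from y = j to y = j + 1
        if j + 1 == x:
            q = 1
        elif rem(x, j + 1) == x:
            q += 1
    return q


def rad(x):
    """Integral part of the square root of x."""
    r = 1                              # rad(1) = 1
    for j in range(1, x):              # step from x = j to x = j + 1
        if qt(r + 1, j + 1) == r + 1:
            r += 1
    return r


def minimum(x, y):
    m = 1                              # min(x, 1) = 1
    for j in range(1, y):
        if x > j:
            m += 1
    return m


# Compare with the closed forms on small positive arguments.
for x in range(1, 15):
    assert rad(x) == isqrt(x)
    for y in range(1, 15):
        assert rem(x, y) == (y - 1) % x + 1
        assert qt(x, y) == max(1, y // x)
        assert minimum(x, y) == min(x, y)
```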

3.3.6 Further Properties of Recursive Functions


If $f(x_1, \dots, x_n)$ is recursive then

$$S_f = \sum_{k=1}^{x_n} f(x_1, \dots, x_{n-1}, k), \qquad P_f = \prod_{k=1}^{x_n} f(x_1, \dots, x_{n-1}, k)$$

are recursive. We can also obtain recursive functions from $f$ in the following
ways:


a) by any substitution of the arguments;

b) by introducing any number of extra arguments;

c) by identifying the members of any group of arguments (e.g. $f(x, x)$ instead
of $f(x, y)$ etc.)


The map $f : \mathbb{Z}^m \to \mathbb{Z}^n$ is recursive if and only if all of its components $\mathrm{pr}^n_i \circ f$
are recursive.


Definition 3.10. The set $E \subset \mathbb{Z}^n$ is called enumerable, if there exists a partially
recursive function $f$ such that $E = D(f)$ (the domain of definition).


The discussion of §3.1 and §3.2 shows that enumerability has the following
intuitive meaning: there exists a program which recognizes the elements $x$
belonging to $E$, but which does not necessarily recognize elements which do not
belong to $E$. Below, a different description of the enumerable sets will be
given, which will explain the etymology of the name: they are the sets with
the property that all their elements may be obtained (possibly with repetitions
and in an unknown order) by a "generating program".

The following simple fact is easily deduced from the properties of partially 
recursive functions. 


3.3.7 Link with Level Sets 


Proposition 3.11. The following three classes coincide: a) the enumerable 
sets; 

b) the level sets of partially recursive functions; 

c) the 1-level sets of partially recursive functions. 


A much more difficult statement is the following result and its corollaries. 


3.3.8 Link with Projections of Level Sets 


Theorem 3.12. The following two classes coincide: 
a) the enumerable sets; 
b) the projections of level sets of primitively recursive functions. 


Among the primitively recursive functions are the polynomials with coefficients
in $\mathbb{Z}^+$. Recall that Diophantine sets are projections of the level sets of
such polynomials. The Matiyasevich theorem can now be stated precisely as
follows:


3.3.9 Matiyasevich’s Theorem


Theorem 3.13. The enumerable sets are Diophantine; hence the two classes 
coincide. 


We sketch the proof of Theorem 3.12 in this section, and of Theorem 3.13 
in the next section. 

Let us temporarily call the projections of level-sets of primitively recursive 
functions the primitively enumerable sets. In the first part of the proof of 
Theorem 3.12 it is established that the primitively enumerable sets are all 
enumerable; in the second part the opposite inclusion is proved. 

We therefore let $f(x_1, \dots, x_n, x_{n+1}, \dots, x_{n+m})$ be a primitively recursive
function, and $E$ the projection of its 1-level to the first $n$ coordinates. We
shall explicitly construct a partially recursive function $g$ such that $E = D(g)$.
This will show that any primitively enumerable set must be enumerable.

We divide the proof into three cases depending on the codimension of the 
projection: m = 0,1 or m > 2. 


Case a): $m = 0$. Then the set $E$ is the 1-level of $f$ and is enumerable by
Proposition 3.11.

Case b): $m = 1$. Set

$$g(x_1, \dots, x_n) = \min\{x_{n+1} \mid f(x_1, \dots, x_n, x_{n+1}) = 1\}.$$

It is clear that $g$ is partially recursive and $D(g) = E$.

Case c): m > 2. We shall reduce this to the previous case using the following 
lemma, which is interesting in itself (the lack of a notion of “dimension” in 
“recursive geometry”) and plays an important role in various other ques- 
tions. 


3.3.10 The existence of certain bijections 


Lemma 3.14. For all $m \ge 1$ there exists a one-to-one map $t^{(m)} : \mathbb{Z}^+ \to \mathbb{Z}^m$
such that:


a) the functions $t^{(m)}_i = \mathrm{pr}^m_i \circ t^{(m)}$ are primitively recursive for all $1 \le i \le m$;
b) the inverse function $\tau^{(m)} : \mathbb{Z}^m \to \mathbb{Z}^+$ is primitively recursive.


Application of the lemma.
We apply Lemma 3.14 in the case 3.3.9 c) and set for $m \ge 2$

$$g(x_1, \dots, x_n, y) = f(x_1, \dots, x_n, t^{(m)}_1(y), \dots, t^{(m)}_m(y)).$$

It is clear that $g$, being a composition of primitively recursive functions, is itself
primitively recursive. It is easy to check that $E$ coincides with the projection of
the 1-level set of the function $g$ to the first $n$ coordinates. Since this projection
is of codimension 1, we have reduced Case c to Case b.

Proof of the lemma. The case $m = 1$ is trivial. We shall prove the lemma
by induction on $m$, starting from $m = 2$.


Construction of $t^{(2)}$. We first construct $\tau^{(2)} : \mathbb{Z}^2 \to \mathbb{Z}^+$ by setting

$$\tau^{(2)}(x_1, x_2) = \frac{1}{2}\left((x_1 + x_2)^2 - x_1 - 3x_2 + 2\right).$$

It is easy to check that if we index the pairs $(x_1, x_2) \in \mathbb{Z}^2$ in the "Cantor order",
and inside each group with given $x_1 + x_2$ in increasing order, then $\tau^{(2)}(x_1, x_2)$
will be exactly the number of the pair $(x_1, x_2)$ in this list. Thus $\tau^{(2)}(x_1, x_2)$ is
bijective and primitively recursive (use (3.3.4) and the recursivity of qt from
§3.3.5 in order to take into account the factor 1/2).

The reconstruction of a pair $(x_1, x_2)$ from its image $y$ is an elementary
task and this leads to the following formula for the inverse function $t^{(2)}$:

$$t^{(2)}_1(y) = y - \frac{1}{2}\left[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\right]\left(\left[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\right] + 1\right),$$

$$t^{(2)}_2(y) = \left[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\right] - t^{(2)}_1(y) + 2.$$

Here $[z]$ denotes the integral part of $z$. Using the results and methods of §3.3.5
- §3.3.6, one can verify that these functions are primitively recursive.
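
The pairing $\tau^{(2)}$ and its inverse can be checked with exact integer arithmetic. The sketch below uses the identity $[\sqrt{2y - 7/4} - 1/2] = [(\sqrt{8y - 7} - 1)/2]$ so that `math.isqrt` can replace the real square root; the function names are assumptions of the example.

```python
from math import isqrt


def tau2(x1, x2):
    """Number of the pair (x1, x2) of positive integers in Cantor order."""
    return ((x1 + x2) ** 2 - x1 - 3 * x2 + 2) // 2


def t2(y):
    """Inverse of tau2: the y-th pair in Cantor order."""
    w = (isqrt(8 * y - 7) - 1) // 2    # integral part of sqrt(2y - 7/4) - 1/2
    x1 = y - w * (w + 1) // 2
    x2 = w - x1 + 2
    return x1, x2


# Round trips in both directions on a small range.
assert all(tau2(*t2(y)) == y for y in range(1, 500))
assert all(t2(tau2(a, b)) == (a, b) for a in range(1, 30) for b in range(1, 30))
```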


Construction of $t^{(m)}$. Assume that $t^{(m-1)}$, $\tau^{(m-1)}$ are already constructed,
and their properties are proved. Set first of all

$$\tau^{(m)} = \tau^{(2)}(\tau^{(m-1)}(x_1, \dots, x_{m-1}), x_m).$$

It is clear that $\tau^{(m)}$ is primitively recursive and bijective. Solving the equation

$$\tau^{(2)}(\tau^{(m-1)}(x_1, \dots, x_{m-1}), x_m) = y$$

in two steps, we get the following formulae for the inverse function $t^{(m)}$:

$$t^{(m)}_m(y) = t^{(2)}_2(y), \qquad t^{(m)}_i(y) = t^{(m-1)}_i(t^{(2)}_1(y)),\quad 1 \le i \le m - 1.$$

By induction, $t^{(m)}$ is primitively recursive.
This finishes the proof of the lemma and the first part of the proof of
Theorem 3.12.


The second part of the proof. We now prove that every enumerable set is
primitively enumerable. We begin with the following easily verified property of
the class of primitively enumerable sets.

3.3.11 Operations on primitively enumerable sets


Lemma 3.15. The class of primitively enumerable sets is closed under the 
operations of finite direct sum, finite intersection, finite union and projection. 


Now let $E$ be an enumerable set. Using Proposition 3.11 of §3.3.7 we realize
it as the 1-level of a partially recursive function $f : \mathbb{Z}^n \to \mathbb{Z}^+$. Note that
in order to prove that $E$ is primitively enumerable it suffices to check that
the graph $\Gamma_f \subset \mathbb{Z}^n \times \mathbb{Z}^+$ is primitively enumerable. Indeed it is clear that
$E$ coincides with the projection onto the first $n$ coordinates of
the set $\Gamma_f \cap (\mathbb{Z}^n \times \{1\})$. Also the set $\{1\} \subset \mathbb{Z}^+$ is primitively enumerable in
view of the properties listed in §3.3.4, so if we prove that $\Gamma_f$ is primitively
enumerable, then the same would follow for $E$ by Lemma 3.15. We have thus
reduced our problem to that of proving that the graphs of partially recursive
functions $f$ are primitively enumerable.

With this purpose we check that: a) the graphs of simple functions are 
primitively enumerable; b) if we are given functions whose graphs are prim- 
itively enumerable, then any function obtained from them using one of the 
elementary operations also has a primitively enumerable graph. 

Stability under recursion and the $\mu$-operation are the most delicate points.
In order to prove these, the following nice lemma is used.


3.3.12 Gödel’s function


Lemma 3.16. There exists a primitively recursive function $\mathrm{Gd}(k, t)$ (Gödel’s
function) with the following property: for each $N \in \mathbb{Z}^+$ and for any finite
sequence $a_1, \dots, a_N \in \mathbb{Z}^+$ of length $N$ there exists $t \in \mathbb{Z}^+$ such that
$\mathrm{Gd}(k, t) = a_k$ for all $1 \le k \le N$. (In other words, $\mathrm{Gd}(k, t)$ may be regarded
as a sequence of functions of the argument $k$ indexed by the parameter $t$ such
that any function of $k$ on an arbitrarily large interval $1, \dots, N$ can be imitated
by an appropriate term of this sequence.)


In order to prove this it is convenient to put first

$$\mathrm{gd}(u, k, t) = \mathrm{rem}(1 + kt, u)$$

and to show that gd has the same property as Gd if we allow ourselves to
choose $(u, t) \in \mathbb{Z}^2$. After this we could put

$$\mathrm{Gd}(k, y) = \mathrm{gd}(t^{(2)}_1(y), k, t^{(2)}_2(y)),$$

where $t^{(2)} : \mathbb{Z}^+ \to \mathbb{Z}^2$ is the isomorphism of Lemma 3.14. Getting rid of the
auxiliary parameter $u$ in $\mathrm{Gd}(k, t)$ (in comparison with $\mathrm{gd}(u, k, t)$) causes no
essential problems.
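
One standard way to see that gd can imitate any finite sequence is the Chinese remainder theorem; the text does not spell out the argument, so the following sketch is an assumption about how such a pair $(u, t)$ may be found. With $t = N! \cdot \max a_k$ the moduli $1 + kt$ ($1 \le k \le N$) are pairwise coprime and not smaller than the $a_k$.

```python
from math import factorial


def rem(x, y):
    """Remainder of y modulo x, taken in {1, ..., x}."""
    return (y - 1) % x + 1


def gd(u, k, t):
    return rem(1 + k * t, u)


def encode(seq):
    """Return (u, t) with gd(u, k, t) == seq[k - 1] for k = 1, ..., len(seq).

    seq must be a non-empty list of positive integers.
    """
    n = len(seq)
    t = factorial(n) * max(seq)
    u, modulus = 0, 1
    for k, a in enumerate(seq, start=1):
        m = 1 + k * t
        # Adjust u by a multiple of the previous moduli so that u = a mod m.
        step = ((a - u) * pow(modulus, -1, m)) % m
        u += modulus * step
        modulus *= m
    return (u if u > 0 else modulus), t


sequence = [3, 1, 4, 1, 5, 9, 2, 6]
u, t = encode(sequence)
decoded = [gd(u, k, t) for k in range(1, len(sequence) + 1)]
```

The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.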

3.3.13 Discussion of the Properties of Enumerable Sets


Theorem 3.12 of §3.3.8 shows that if $E$ is enumerable, then there exists a
program "generating" $E$ (cf. §3.3.6). Indeed, let $E$ be the projection onto
the first $n$ coordinates of the 1-level of a primitively recursive function
$f(x_1, \dots, x_n, y)$. The program "generating" $E$ should check one-by-one the
vectors $(x_1, \dots, x_n, y)$, say, in Cantor's order; it should compute $f$ and output
$(x_1, \dots, x_n)$ if and only if $f(x_1, \dots, x_n, y) = 1$. Since $f$ is primitively
recursive, the generating program will sooner or later write down each element of
$E$, and no other element. It cannot get stuck forever on elements not belonging to
$E$. However, if $E$ were empty we could never find this out just by waiting.
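
A generating program of this kind can be sketched as a Python generator. As a toy assumption, $n = 1$ and $f(x, y) = (x - y^2)^2 + 1$, so that $E$ is the set of perfect squares; the names `cantor_pairs`, `f` and `generate` are illustrative.

```python
from itertools import count, islice


def cantor_pairs():
    """All pairs (x, y) of positive integers, ordered by x + y and then by x."""
    for s in count(2):
        for x in range(1, s):
            yield x, s - x


def f(x, y):
    """An everywhere defined function whose 1-level is {(x, y) : x = y^2}."""
    return (x - y * y) ** 2 + 1


def generate():
    """Write down the elements of E, the projection of the 1-level of f."""
    for x, y in cantor_pairs():
        if f(x, y) == 1:
            yield x


first_elements = list(islice(generate(), 5))
```

The generator never terminates on its own; `islice` merely stops listening after five outputs, which mirrors the remark that emptiness of $E$ cannot be detected by waiting.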

The set $E \subset \mathbb{Z}^n$ is called solvable, if it and its complement are enumerable.
Intuitively this means that there is a program which decides for any
element of $\mathbb{Z}^n$ whether it belongs to $E$ or not. These sets can be characterized
as being the level sets of general recursive (everywhere defined recursive)
functions, or as the sets whose characteristic function is recursive. In order to
establish these properties, the following result is used.


Proposition 3.17. A partial function $g$ from $\mathbb{Z}^n$ to $\mathbb{Z}^+$ is partially recursive
iff its graph is enumerable.


3.4 Diophantineness of a Set and algorithmic 
Undecidability 


3.4.1 Algorithmic undecidability and unsolvability 


Before explaining how to prove that the classes of Diophantine and of enumerable
sets are the same, we first give some interesting applications of this
theorem. It is known from logic that there are sets which are enumerable
but not solvable. Combining this fact with Matiyasevich's theorem (see Theorem
3.13) and the Church thesis, we deduce that Hilbert's tenth problem (see
section 3.1.2) is undecidable, see [Mat04].

First of all, every natural number is a sum of four squares (Lagrange's theorem,
see Part I, section 1.2.6). The solvability of the equation $f(x_1, \dots, x_n) =
0$ in positive integers is therefore equivalent to the solvability of the equation

$$f\left(1 + \sum_{i=1}^{4} y_{i1}^2,\ \dots,\ 1 + \sum_{i=1}^{4} y_{in}^2\right) = 0$$

in $\mathbb{Z}^{4n}$. It is thus sufficient to establish the algorithmic undecidability of the
class of questions whether equations have solutions in positive integers. Let $E \subset \mathbb{Z}^+$ be
enumerable but not solvable. We represent it as the projection of the 0-level set of a
polynomial $f_t = f(t; x_1, \dots, x_n) = 0$, $f \in \mathbb{Z}[t; x_1, \dots, x_n]$. The equation $f_{t_0} = 0$,
$t_0 \in \mathbb{Z}^+$, is solvable iff $t_0 \in E$. According to a general principle (the Church
thesis), intuitive computability is equivalent to partial recursivity of a function.
This implies that the corresponding class of problems for the family $\{f_t\}$
is algorithmically decidable iff the characteristic function of $E$ is computable.
However this is not the case by the choice of $E$: although $E$ is enumerable,
its complement is not.

Thus the question of solvability in integers is undecidable even for an ap- 
propriate one parameter family of equations. The number of variables, or 
more generally the codimension of the projection can be reduced to 9 (Yu. I. 
Matiyasevich). The precise minimum is still unknown, although this is a very 
intriguing problem. 


3.4.2 Sketch Proof of the Matiyasevich Theorem 


One introduces temporarily a class of sets, intermediate between the enumerable
and Diophantine sets. In order to define this class, consider the map
which takes a subset $E \subset \mathbb{Z}^n$ to a new subset $F \subset \mathbb{Z}^n$ defined by the following
law:

$$(x_1, \dots, x_n) \in F \iff \forall k \in [1, x_n]\ \ (x_1, \dots, x_{n-1}, k) \in E.$$

We shall say in this case that $F$ is obtained from $E$ by use of the restricted
generality quantor (that is, a bounded universal quantifier) on the $n^{\text{th}}$ coordinate.
The restricted generality quantor is defined analogously on any other coordinate.

Definition-Lemma. Consider the following three classes of subsets of $\mathbb{Z}^n$
for any $n$:


I. Projections of the level sets of primitively recursive functions. 

II. The smallest class of sets containing the level sets of polynomials with 
integral coefficients, which is closed under the operations of finite direct 
sum, finite union, finite intersection, projection and the restricted gener- 
ality quantor. 

III. Projections of the level sets of polynomials with integral coefficients. 


Then 


a) Class I coincides with the class of enumerable sets, and Class III with
the class of Diophantine sets. The sets of the class II will be called D-sets.
b) The following inclusions hold: I $\supset$ II $\supset$ III.


The final steps in the proof of the Matiyasevich theorem consist of reductions
similar to those described above. The crucial part is the proof that the
class of Diophantine sets is closed under the use of the restricted generality
quantor. Here one makes use of the Diophantine representations of concrete
sets from §3.2, in order to check that application of Gödel's function does not
damage the Diophantineness.


Note that B. Poonen studied in [Po03] Hilbert’s tenth problem for large 
subrings of Q in connection with Mazur’s conjecture on varieties over Q whose 
real topological closure of rational points has infinitely many components (no 
such varieties are known to this point). For the field of rational numbers 
Hilbert’s tenth problem is a major open question. In trying to answer it two 
general methods have been used: one is to study the similar question in other 
global fields (such as fields of rational functions $\mathbb{F}_q(t)$ over finite fields) and
try to transfer the methods to Q; the other is to try to prove it for ever 
larger subrings of Q. Relations of this problem with arithmetic and algebraic 
geometry were studied in [DLPvG], see also [Sh103]. 

Arithmetic of algebraic numbers


4.1 Algebraic Numbers: Their Realizations and 
Geometry 


4.1.1 Adjoining Roots of Polynomials 


The idea to extend the field of rational numbers owes a lot to various attempts 
to solve some concrete Diophantine equations. The use of irrational numbers 
which are roots of polynomials with rational coefficients often makes it possible 
to reduce such equations to more convenient forms. An intriguing example of 


this is the study of the Fermat equation (cf. [BS85], [Pos78], [Edw77], [Rib79]):

$$x^n + y^n = z^n \quad (n > 2). \qquad (4.1.1)$$


The unsolvability of (4.1.1) in non-zero integers for $n > 2$ is now established in
the work of Wiles [Wi] and Wiles-Taylor [Ta-Wi] on the Shimura-Taniyama-Weil
conjecture and Fermat's Last Theorem, see chapter 7. Wiles used various
sophisticated techniques and ideas due to himself and a number of other
mathematicians (K.Ribet, G.Frey, Y.Hellegouarch, J.-M.Fontaine, B.Mazur,
H.Hida, J.-P.Serre, J.Tunnell, ...). This genuinely historic event concludes a
whole epoch in number theory.

Notice that before the work of Wiles, it was known from results due to
Faltings (see chapter 5, §5.5) that the number of primitive solutions (i.e. such
that $\mathrm{GCD}(x, y, z) = 1$) is finite for each $n > 2$. If $n$ is an odd integer then the
left hand side transforms into the following product:

$$\prod_{k=0}^{n-1} (x + \zeta^k y) = z^n, \qquad (4.1.2)$$

where $\zeta = \exp(2\pi i/n)$ is a primitive $n^{\text{th}}$ root of unity. If we suppose that
the ring $R = \mathbb{Z}[\zeta]$ has unique factorization of elements, then by studying the
divisibility properties of the left hand side of (4.1.1) one can prove that (4.1.1)
has no solutions in integers not divisible by $n$ (this is the first case of the Fermat
conjecture: $n \nmid xyz$) (Kummer). However, this unique factorization property is
far from being always satisfied: J.M.Masley and H.L.Montgomery (cf. [MM76])
have found all $n$ with this property; it turns out that there are altogether 29
such numbers, and the primes among them are $n = 3, 5, 7, 11, 13, 17, 19$. Notice
that before the work of Wiles, the validity of the first case of Fermat's Last
Theorem had been established for infinitely many primes ([AdHB85], [Fou85],
[GM]).
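
The factorization behind (4.1.2) can be verified numerically with complex floating-point arithmetic. This is a rough sketch with a tolerance, not an exact computation in $\mathbb{Z}[\zeta]$; the function name is an assumption of the example.

```python
import cmath


def cyclotomic_product(x, y, n):
    """Product of (x + zeta^k * y) for k = 0..n-1, zeta = exp(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    product = 1
    for k in range(n):
        product *= x + zeta ** k * y
    return product


# For odd n the product equals x^n + y^n (up to rounding errors).
assert abs(cyclotomic_product(2, 3, 5) - (2 ** 5 + 3 ** 5)) < 1e-6
# For even n the sign of y^n changes, which is why n is taken odd.
assert abs(cyclotomic_product(2, 3, 4) - (2 ** 4 - 3 ** 4)) < 1e-6
```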


Let $\alpha$ be a complex root of an irreducible polynomial $f(x) = x^n +
a_{n-1}x^{n-1} + \dots + a_1x + a_0 \in \mathbb{Q}[x]$ with rational coefficients $a_i \in \mathbb{Q}$. If $k = \mathbb{Q}(\alpha)$
is the smallest field containing $\alpha$ then each of its elements $\beta$ has the form
$\beta = r(\alpha)$, where $r(x) \in \mathbb{Q}[x]$ is a polynomial of degree $\deg r(x) < n$, and the
arithmetical operations in $\mathbb{Q}(\alpha)$ are the same as those with residues mod $f$
in the ring of polynomials $\mathbb{Q}[x]$.

In other words, there is an isomorphism between $k$ and the quotient
ring $\mathbb{Q}[x]/(f)$, and $k$ is an $n$-dimensional vector space over $\mathbb{Q}$ (with basis
$1, \alpha, \dots, \alpha^{n-1}$). A choice of basis gives another realization of elements of $k$ as
$n \times n$ square matrices: to an element $\beta$ one attaches the matrix of the linear
transformation $\varphi_\beta : x \mapsto \beta x$ (with respect to the chosen basis). For the basis
$\{1, \alpha, \dots, \alpha^{n-1}\}$ the endomorphism $\varphi_\alpha$ is described by the matrix (sometimes
called the adjoint matrix, and also known as the companion matrix of $f$):

$$A_\alpha = \begin{pmatrix} 0 & 0 & \dots & 0 & -a_0 \\ 1 & 0 & \dots & 0 & -a_1 \\ 0 & 1 & \dots & 0 & -a_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 1 & -a_{n-1} \end{pmatrix}$$

and the smallest subring of the matrix algebra $M_n(\mathbb{Q})$ containing $A_\alpha$ can be
identified with $k$. Each element $\beta \in k$ is a root of the characteristic polynomial
of the endomorphism $\varphi_\beta$, and its determinant and trace are denoted $N\beta$ and
$\mathrm{Tr}\,\beta$. These are called the norm and the trace of $\beta$. The bilinear form $B :
k \times k \to \mathbb{Q}$ defined by $B(u, v) = \mathrm{Tr}(uv)$ is non-degenerate. An element $\beta$ is
called integral if all of the coefficients $b_i$ of its characteristic polynomial

$$\det(X \cdot 1_n - \varphi_\beta) = X^n + b_{n-1}X^{n-1} + \dots + b_0 \in \mathbb{Q}[X]$$

are integers. This condition is equivalent to saying that the ring $\mathbb{Z}[\beta]$ is a
finitely generated Abelian group. The set of all integers of $k$ will be denoted
by $\mathcal{O} = \mathcal{O}_k$. This is a free $\mathbb{Z}$-module (a free Abelian group) with a basis
$\omega_1, \dots, \omega_n$. The determinant of the bilinear form $B(u, v)$ with respect to such
a basis is called the discriminant of $k$, and is denoted by $D = D_k$. This is
independent of the choice of basis of $\mathcal{O}_k$.
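
The matrix realization can be tried on a small example. The sketch below assumes $f(x) = x^3 - 2$, so $\alpha = \sqrt[3]{2}$, and uses plain nested lists with integer entries; all helper names are illustrative.

```python
def companion(coeffs):
    """Matrix of multiplication by alpha in the basis 1, alpha, ..., alpha^(n-1).

    coeffs = [a_0, ..., a_{n-1}] for the monic polynomial
    x^n + a_{n-1} x^{n-1} + ... + a_0.
    """
    n = len(coeffs)
    A = [[0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i - 1] = 1               # alpha * alpha^(i-1) = alpha^i
    for i in range(n):
        A[i][n - 1] = -coeffs[i]      # alpha^n expressed via lower powers
    return A


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def det(A):
    """Determinant by expansion along the first row (fine for small n)."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, entry in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


A = companion([-2, 0, 0])                       # f(x) = x^3 - 2
identity = [[int(i == j) for j in range(3)] for i in range(3)]
A_cubed = matmul(matmul(A, A), A)               # should equal 2 * identity
one_plus_alpha = [[identity[i][j] + A[i][j] for j in range(3)] for i in range(3)]
```

Here `det(A)` and `trace(A)` give the norm and trace of $\alpha$, and `det(one_plus_alpha)` gives the norm of $1 + \alpha$.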

The idea of symbolically manipulating the roots of polynomials has led to
the theory of algebraic extensions of arbitrary fields, for which one may repeat
the above constructions. If $k \subset K$ are two fields and the dimension $[K : k]$
is finite, then for any $\beta \in K$ one defines analogously $N_{K/k}(\beta)$ and $\mathrm{Tr}_{K/k}(\beta)$.
The claim that the form $B(u,v) = \mathrm{Tr}_{K/k}(uv)$ is non-degenerate is one of the
definitions of a separable extension. If this is the case one can always find an
element $\gamma \in K$ such that $K = k(\gamma)$ (this statement is known as the primitive
element theorem) (cf. [La65], [Sha87]).

Adjoining the roots of all the irreducible polynomials in $k[X]$ to the ground
field $k$ leads to the construction of an algebraic closure $\bar{k}$ of $k$. This is a
field, uniquely defined by $k$ up to isomorphism, which consists of elements
algebraic over $k$, and which is also algebraically closed. This means that every
polynomial $f(X) \in \bar{k}[X]$ with $\deg f > 0$ has a root $\alpha \in \bar{k}$. When we write $\bar{\mathbb{Q}}$
we often mean the complex realization of this field as the set of all complex
numbers $\alpha \in \mathbb{C}$ which are roots of polynomials with rational coefficients.


4.1.2 Galois Extensions and Frobenius Elements 


(cf. [La65], [LN83]). In general let $K/k$ be a finite separable extension of
degree $n = [K : k]$, $k \subset K \subset \bar{k}$. Then $K/k$ is called a Galois extension if for
every embedding $\lambda : K \to \bar{k}$ over $k$ (i.e. $\lambda(x) = x$ for $x \in k$) one has
$\lambda(K) = K$. In this case the automorphisms $\lambda : K \to K$ over $k$ form a group
$G(K/k) = \mathrm{Aut}(K/k)$ of order $n$ which is called the Galois group. In what follows
the action of $\sigma \in G(K/k)$ on $x \in K$ will be denoted either by $x^\sigma$, or by $\sigma(x)$,
so that the composition law is $(\tau\sigma)(x) = \tau(\sigma(x))$, $x^{\tau\sigma} = (x^\sigma)^\tau$ (a left action of
$G(K/k)$ on $K$).


Theorem 4.1 (Main Theorem of Galois Theory). There is a one-to-one
correspondence between subgroups $H \subset G(K/k)$ and intermediate fields
$L$ with $k \subset L \subset K$. This correspondence is defined by the following law:

$$H \mapsto K^H = \{x \in K \mid x^\sigma = x \text{ for all } \sigma \in H\},$$
$$L \mapsto H_L = \{\sigma \in G(K/k) \mid x^\sigma = x \text{ for all } x \in L\}.$$

The normal subgroups $H \lhd G(K/k)$ correspond exactly to the Galois subextensions
$L/k$, and for such subgroups or extensions we have $G(L/k) = G(K/k)/H_L$.


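Example 4.2 (Finite Fields.). Let $K = \mathbb{F}_q$ be a finite field with $q$ elements.
Then $q = p^f$ and $\mathbb{F}_q$ is a vector space of dimension $f$ over the prime subfield
$\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ (cf. [LN83]). For any integer $r > 0$ the algebraic closure $\bar{\mathbb{F}}_q$ contains
exactly one extension of $\mathbb{F}_q$ of degree $r$:

$$\mathbb{F}_{q^r} = \{x \in \bar{\mathbb{F}}_q \mid x^{q^r} = x\},$$

so that $x^{q^r - 1} = 1$ for all elements of the multiplicative group $\mathbb{F}_{q^r}^\times$. The extension
$\mathbb{F}_{q^r}/\mathbb{F}_q$ is therefore a Galois extension, and its Galois group is cyclic of
order $r$:

$$G(\mathbb{F}_{q^r}/\mathbb{F}_q) = \{1, \mathrm{Fr}_q, \mathrm{Fr}_q^2, \dots, \mathrm{Fr}_q^{r-1}\},$$

where $\mathrm{Fr}_q(x) = x^q$ is the Frobenius automorphism.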
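
The smallest non-trivial instance of this example can be checked by hand or by machine. The sketch below assumes the concrete model $\mathbb{F}_9 = \mathbb{F}_3[i]$ with $i^2 = -1$ (a choice made here for illustration, since $X^2 + 1$ is irreducible mod 3); elements are pairs `(a, b)` standing for $a + bi$, and the function names are hypothetical.

```python
P = 3  # characteristic; F_9 is modelled as F_3[i] with i^2 = -1

def mul(x, y):
    """Product of a + b*i and c + d*i in F_3[i]."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(x, n):
    """x^n by repeated multiplication (n >= 0)."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, x)
    return result

def frobenius(x):
    """The Frobenius automorphism Fr_3: x -> x^3."""
    return power(x, P)

F9 = [(a, b) for a in range(P) for b in range(P)]

# Elements fixed by Frobenius: these should form the prime subfield F_3.
fixed_field = [x for x in F9 if frobenius(x) == x]
```

Every element satisfies $x^9 = x$, the Frobenius map has order 2 (here it is simply $a + bi \mapsto a - bi$), and its fixed field is $\mathbb{F}_3$, as the Galois correspondence predicts for $r = 2$.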

Example 4.3 (Cyclotomic Fields.). Let $\zeta_m$ be a primitive root of unity of
degree $m$. Then the field $K_m = \mathbb{Q}(\zeta_m)$ contains all roots of the polynomial
$X^m - 1 = \prod_{i=0}^{m-1}(X - \zeta_m^i)$ and is therefore a Galois extension. If $\sigma \in G(K_m/\mathbb{Q})$
then the element $\sigma\zeta_m$ must also be a primitive $m$-th root of unity, so that
$\sigma\zeta_m = \zeta_m^a$ for some $a$ with $(a,m) = 1$. If $\zeta_m^k$ is another $m$-th root of unity then
$\sigma(\zeta_m^k) = \zeta_m^{ak}$. Hence the correspondence $\sigma \mapsto a \pmod m$ produces a canonical
map $G(K_m/\mathbb{Q}) \to (\mathbb{Z}/m\mathbb{Z})^\times$ which is in fact an isomorphism. In order to
prove this it suffices to show that the cyclotomic polynomial

$$\Phi_m(X) = \prod_{\substack{i=1 \\ (i,m)=1}}^{m} (X - \zeta_m^i)$$


is irreducible over $\mathbb{Q}$. First we see that $X^m - 1 = \prod_{d \mid m} \Phi_d(X)$, and hence
$\Phi_m(X) = \prod_{d \mid m} (X^{m/d} - 1)^{\mu(d)} \in \mathbb{Z}[X]$ (where $\mu(d)$ is the Möbius function
of $d$). The irreducibility is established by reducing the polynomials modulo
$p$: $\mathbb{Z}[X] \to \mathbb{F}_p[X]$, $f(X) \mapsto \bar{f}(X) \in \mathbb{F}_p[X]$. One applies the properties of the
Frobenius endomorphism $\bar{f}(X) \mapsto \bar{f}(X)^p = \bar{f}(X^p)$ in the ring $\mathbb{F}_p[X]$.
Suppose that $\Phi_m(X)$ is not irreducible and let

$$\Phi_m(X) = f_1(X) \cdots f_r(X)$$

be the decomposition of $\Phi_m$ as a product of irreducible polynomials in $\mathbb{Z}[X]$.
We show that for all $a \bmod m$ with $(a,m) = 1$, $f_1(\zeta_m) = 0$ implies $f_1(\zeta_m^a) = 0$.
We use the existence of a prime $p$ such that $p \equiv a \bmod m$. The polynomial
$X^m - 1$ is coprime to its derivative $mX^{m-1}$ in $\mathbb{F}_p[X]$ since $p \nmid m$. Hence the
polynomials $\bar{f}_1(X), \dots, \bar{f}_r(X)$ are pairwise coprime.

If $f_1(\zeta_m^a) \ne 0$ then we have $f_j(\zeta_m^a) = 0$ for some $j \ne 1$, which implies
$f_j(\zeta_m^p) = 0$, because $\zeta_m^a = \zeta_m^p$. Hence $f_1(X)$ has a common factor with $f_j(X^p)$.
In fact since $f_1$ is irreducible, it must divide $f_j(X^p)$. Therefore $\bar{f}_1(X)$ divides
$\bar{f}_j(X^p) = \bar{f}_j(X)^p$. This contradicts the fact that $\bar{f}_1(X)$ and $\bar{f}_j(X)$ are coprime.


Note that we do not need to assume the existence of a $p$ such that
$p \equiv a \bmod m$. We could instead consider the decomposition $a = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$ and
study all the reductions mod $p_i$, $i = 1, \dots, s$ (cf. [BS85], [La65], [Chev40],
[La78b], [Wash82]).
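
The identity $X^m - 1 = \prod_{d \mid m} \Phi_d(X)$ used above also gives a practical way to compute $\Phi_m$: divide $X^m - 1$ by $\Phi_d$ for every proper divisor $d$ of $m$. The following is a small sketch of that recursion, assuming polynomials are stored as integer coefficient lists with the constant term first; the function names are illustrative only.

```python
def poly_divide_exact(num, den):
    """Quotient num/den for integer polynomials (constant term first).

    den must be monic and must divide num exactly.
    """
    num = num[:]
    q = [0] * (len(num) - len(den) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1]       # current leading coefficient
        for j, c in enumerate(den):
            num[i + j] -= q[i] * c
    assert not any(num), "division was not exact"
    return q

def cyclotomic(m):
    """Coefficients of Phi_m, obtained from X^m - 1 = prod_{d | m} Phi_d(X)."""
    poly = [-1] + [0] * (m - 1) + [1]      # X^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return poly
```

For example `cyclotomic(12)` returns `[1, 0, -1, 0, 1]`, i.e. $\Phi_{12}(X) = X^4 - X^2 + 1$, of degree $\varphi(12) = 4$. The recursion recomputes smaller $\Phi_d$ repeatedly, which is acceptable for small $m$ but would call for caching otherwise.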

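Recall that a Dirichlet character $\chi$ modulo $m$ is a homomorphism
$\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$. These are often regarded as functions on $\mathbb{Z}$ such that
$\chi(x) = \chi(x \bmod m)$ if $(x,m) = 1$, and $\chi(x) = 0$ if $(x,m) > 1$ (see Part I,
§2.2.2). According to what we have proved, there is a canonical isomorphism
$G(K_m/\mathbb{Q}) \cong (\mathbb{Z}/m\mathbb{Z})^\times$. Hence a Dirichlet character defines a homomorphism
$\rho_\chi : G(\bar{\mathbb{Q}}/\mathbb{Q}) \to \mathbb{C}^\times$ by means of the projection $G(\bar{\mathbb{Q}}/\mathbb{Q}) \to G(K_m/\mathbb{Q})$.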


Theorem 4.4 (Theorem of Kronecker-Weber). For any homomorphism
$\rho : G(\bar{\mathbb{Q}}/\mathbb{Q}) \to \mathbb{C}^\times$ of finite order there exists a Dirichlet character $\chi$ such that

$$\rho = \rho_\chi$$

(cf. [Sha51], [AT51], [Chev40]).

The theorem of Kronecker-Weber can be restated as saying that any Galois
extension $K/\mathbb{Q}$ whose Galois group $G(K/\mathbb{Q})$ is commutative (i.e. any Abelian
extension) is contained in a cyclotomic extension.

A remarkable fact is that the elements of the Galois group $G(K_m/\mathbb{Q})$
correspond to prime numbers (more precisely, to $p \bmod m$ for $p \nmid m$). The deepest
results of algebraic number theory are related to generalizations of the
Kronecker-Weber theorem. For example, Deligne and Serre have shown that
there exists a correspondence between two-dimensional irreducible complex
representations $\rho : G(\bar{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{C})$ such that $\det\rho = \rho_\chi$ for an odd character
$\chi$, and primitive cusp forms of weight one (cf. Chap. 6, §6.4, the theorem of
Deligne-Serre). It is conjectured that this correspondence is one-to-one, and
therefore gives a two-dimensional analogue of the Kronecker-Weber theorem.


4.1.3 Tensor Products of Fields and Geometric Realizations of 
Algebraic Numbers 


In order to obtain a convenient geometric realization of an algebraic number
field $k$, we use the tensor product $k \otimes \mathbb{R}$. Constructions involving tensor
products of fields are frequently used in algebraic number theory, and for this
reason we begin with a general result on these products.


Theorem 4.5 (Theorem on Tensor Products of Fields). Let $K/k$ be a
finite separable extension, $K = k(\gamma)$, and let $L/k$ be another extension, and
suppose that

$$K \cong k[X]/(f_\gamma(X)), \qquad f_\gamma(X) = \prod_{i=1}^{m} g_i(X)$$

is the decomposition as a product of irreducible polynomials in the ring $L[X]$.
Then there is a ring isomorphism

$$K \otimes_k L \cong \prod_{i=1}^{m} L_i$$

where $L_i = L[X]/(g_i(X))$ are finite extensions of $L$ containing $K$ under the
embeddings $\lambda_i : K \to L_i$, defined by

$$\lambda_i(r(\gamma)) = r(X) \bmod g_i(X)$$

(cf. [CF67], [Chev40]).

The proof of this theorem is similar to that of the Chinese remainder
theorem. The elements $r(\gamma) \otimes_k l$ with $l \in L$, $r(X) \in k[X]$ generate the whole
ring $K \otimes_k L$, and the isomorphism is given by

$$r(\gamma) \otimes_k l \mapsto (l\,r(X) \bmod g_1(X), \dots, l\,r(X) \bmod g_m(X)).$$
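
Corollary 4.6. Let $\beta \in K$. If $f_\beta(X) \in k[X]$ is its characteristic polynomial
in the extension $K/k$ and $f_{\beta,i}(X) \in L[X]$ are the characteristic polynomials
of $\lambda_i(\beta)$ in the extensions $L_i/L$, then

$$f_\beta(X) = \prod_{i=1}^{m} f_{\beta,i}(X).$$

In particular, we have

$$N_{K/k}(\beta) = \prod_{i=1}^{m} N_{L_i/L}(\lambda_i(\beta)), \qquad (4.1.3)$$
$$\mathrm{Tr}_{K/k}(\beta) = \sum_{i=1}^{m} \mathrm{Tr}_{L_i/L}(\lambda_i(\beta)). \qquad (4.1.4)$$

If we take $\bar{k}$ for $L$, then $m = n$ and $\lambda_1, \dots, \lambda_n$ are all possible embeddings
$\lambda_i : K \to \bar{k}$,

$$f_\beta(X) = (X - \lambda_1(\beta)) \cdots (X - \lambda_n(\beta)).$$

Hence for any $\beta \in K$ we have

$$N_{K/k}(\beta) = \prod_{i=1}^{n} \lambda_i(\beta), \qquad \mathrm{Tr}_{K/k}(\beta) = \sum_{i=1}^{n} \lambda_i(\beta). \qquad (4.1.5)$$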


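By putting $L = \mathbb{R}$, $K = \mathbb{Q}(\gamma)$ and $k = \mathbb{Q}$, we obtain a geometric realization
of the algebraic numbers. Let
$f_\gamma(X) = (X - \gamma_1) \cdots (X - \gamma_{r_1}) \cdot (X^2 + \alpha_1 X + \beta_1) \cdots (X^2 + \alpha_{r_2} X + \beta_{r_2})$
be the decomposition of the minimal polynomial
$f_\gamma(X) \in \mathbb{Q}[X]$ of $\gamma$ into irreducible polynomials over $\mathbb{R}$. Then

$$K \otimes_{\mathbb{Q}} \mathbb{R} = \mathbb{Q}(\gamma) \otimes \mathbb{R} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \qquad (4.1.6)$$

(this is an $\mathbb{R}$-algebra isomorphism), or $\mathbb{Q}(\gamma) \otimes \mathbb{R} \cong \mathbb{R}^n$ as a real vector space,
so that $n = r_1 + 2r_2$. Let $\lambda_1, \dots, \lambda_{r_1}, \dots, \lambda_{r_1+r_2}$ be the embeddings of Theorem 4.5
(with $L = \mathbb{R}$). Then the tuple

$$\lambda = (\lambda_1, \dots, \lambda_{r_1+r_2})$$

defines an embedding of $K$ into $\mathbb{R}^n$, and any embedding of $K$ into $\mathbb{C}$ is one of
the following

$$\lambda_1, \dots, \lambda_{r_1}, \lambda_{r_1+1}, \bar{\lambda}_{r_1+1}, \dots, \lambda_{r_1+r_2}, \bar{\lambda}_{r_1+r_2}.$$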


A lattice $M$ in a vector space $\mathbb{R}^n$ is by definition a discrete subgroup
$M \subset \mathbb{R}^n$ such that the quotient group $\mathbb{R}^n/M$ is compact (in the natural
topology). Every lattice is a free Abelian group generated by a basis $e_1, \dots, e_n$
of $\mathbb{R}^n$.

If $\mathcal{O}$ is the ring of integers in $K$, then one verifies that its image
$M = \lambda(\mathcal{O}) \subset \mathbb{R}^n$ is a lattice, and

$$D_K = (-4)^{r_2}\, \mathrm{vol}(\mathbb{R}^n/\lambda(\mathcal{O}))^2, \qquad (4.1.7)$$

where $D_K$ is the discriminant of $K$, and $\mathrm{vol}(\mathbb{R}^n/\lambda(\mathcal{O}))$ is the volume of
the fundamental parallelepiped $\{\sum_{i=1}^{n} x_i e_i \mid 0 \le x_i \le 1\}$ of the lattice
$\lambda(\mathcal{O}) = \langle e_1, \dots, e_n \rangle$ with respect to the usual Lebesgue measure on $\mathbb{R}^n$.

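For example, let $K = \mathbb{Q}(\alpha)$ be a quadratic field, where $\alpha^2 = d$ for some
square free integer $d$. Then a calculation of the characteristic polynomial of a
typical element $\beta = a + b\alpha$ (where $a$ and $b$ are rational numbers) shows that
$\mathcal{O} = \mathcal{O}_K = \mathbb{Z}[\omega]$ where

$$\omega = \frac{1 + \alpha}{2} \quad \text{and} \quad D_K = d \quad \text{for } d \equiv 1 \pmod 4,$$

$$\omega = \alpha \quad \text{and} \quad D_K = 4d \quad \text{for } d \equiv 2, 3 \pmod 4.$$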


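If $d$ is positive then the geometric realization of the number $\beta = a + b\alpha$ will
be the point $\lambda(\beta) = (a + b\sqrt{d}, a - b\sqrt{d})$. In the case of an imaginary quadratic
field ($d < 0$) the geometric realization of the number $\beta = a + b\alpha$ will be
the point $a + ib\sqrt{|d|}$ in the complex plane. Since $\mathbb{Z}[\omega] = \langle 1, \omega \rangle$ we have for
positive $d$

$$\mathrm{vol}(\mathbb{R}^2/\mathbb{Z}[\omega]) = \begin{cases} \sqrt{d} & \text{if } d \equiv 1 \bmod 4, \\ 2\sqrt{d} & \text{if } d \equiv 2, 3 \bmod 4, \end{cases}$$

and for negative $d$

$$\mathrm{vol}(\mathbb{C}/\mathbb{Z}[\omega]) = \begin{cases} \sqrt{|d|}/2 & \text{if } |d| \equiv 3 \bmod 4, \\ \sqrt{|d|} & \text{if } |d| \equiv 1, 2 \bmod 4. \end{cases}$$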
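
These volumes, and their relation (4.1.7) to the discriminant, are easy to confirm numerically. The sketch below is only an illustration under stated assumptions: $\mathbb{C}$ is identified with $\mathbb{R}^2$ via real and imaginary parts, the input `d` is assumed square free and different from 1, and the function name `quadratic_lattice` is made up for this example.

```python
import math

def quadratic_lattice(d):
    """Return (D_K, r2, vol) for K = Q(sqrt d), d square free and != 1.

    vol is the area of the parallelogram spanned by the images of 1 and omega.
    """
    disc = d if d % 4 == 1 else 4 * d
    s = math.sqrt(abs(d))
    if d > 0:
        # two real embeddings: sqrt(d) -> +s and sqrt(d) -> -s
        r2 = 0
        one = (1.0, 1.0)
        omega = ((1 + s) / 2, (1 - s) / 2) if d % 4 == 1 else (s, -s)
    else:
        # one complex embedding; a point of C is stored as (Re, Im)
        r2 = 1
        one = (1.0, 0.0)
        omega = (0.5, s / 2) if d % 4 == 1 else (0.0, s)
    vol = abs(one[0] * omega[1] - one[1] * omega[0])
    return disc, r2, vol
```

For $d = -1$ this gives $D_K = -4$ and volume 1, and for $d = 2$ it gives $D_K = 8$ and volume $2\sqrt{2}$; in every case $D_K$ agrees with $(-4)^{r_2}\,\mathrm{vol}^2$ up to rounding. Note that Python's `%` returns a non-negative remainder, so the test `d % 4 == 1` also works for negative `d`.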


Figures 4.1 and 4.2 illustrate the lattices of integers in the quadratic fields
$\mathbb{Q}(\sqrt{-1})$ and $\mathbb{Q}(\sqrt{2})$.


4.1.4 Units, the Logarithmic Map, and the Regulator 


In the ring $\mathbb{Z}$ there are only two invertible elements (units): $1$ and $-1$. The
group of units, i.e. of invertible elements of the ring of integers $\mathcal{O}_K$ of a number
field $K$, has a less trivial structure. However, this group can be completely
described. One uses the notation $E_K = \mathcal{O}_K^\times$.

Some interesting arithmetical problems can be reduced to finding elements
of $E_K$. For example, consider Pell's equation (see Part I, section 1.2.5)

$$x^2 - dy^2 = 1 \qquad (4.1.8)$$

(where $d$ is a square free positive integer).

Fig. 4.1. The lattice $\mathbb{Z}[i] \subset \mathbb{C}$ ($d = -1$, $D = -4$, $\mathrm{vol} = 1$).
Fig. 4.2. The lattice $\mathbb{Z}[\sqrt{2}] \subset \mathbb{R}^2$ ($d = 2$, $D = 8$, $\mathrm{vol} = 2\sqrt{2}$).


Note that if $\beta \in \mathcal{O}_K^\times$ then $N\beta$ and $N\beta^{-1} = N(\beta)^{-1}$ are rational integers,
hence $N\beta = \pm 1$. Conversely, any solution to (4.1.8) in integers $x, y$ produces
a unit $\beta = x + y\alpha$ in the real quadratic field $k = \mathbb{Q}(\alpha)$, $\alpha^2 = d$, since
$N\beta = (x + y\sqrt{d})(x - y\sqrt{d}) = x^2 - dy^2$. On the other hand for all $\beta \in \mathcal{O}_K$ with
$N\beta = \pm 1$ we have that $\beta \in \mathcal{O}_K^\times$. It follows from a general theorem of Dirichlet
on the structure of $E_K$ for an algebraic number field $K$ (the Dirichlet unit
theorem), that for $K = \mathbb{Q}(\sqrt{d})$ one has $E_K = \{\pm\varepsilon^n \mid n \in \mathbb{Z}\}$. Here $\varepsilon$ is
a fundamental unit (which can be uniquely defined by the condition that
$\lambda_1(\varepsilon) = a + b\sqrt{d}$ is minimal with $\lambda_1(\varepsilon) > 1$). The set of solutions to (4.1.8)
can be identified with a subgroup of $E_K$ of the form $\{\pm\varepsilon_0^n \mid n \in \mathbb{Z}\}$, where
$\varepsilon_0 = x_0 + y_0\sqrt{d}$ corresponds to the minimal solution $\lambda_1(\varepsilon_0) = x_0 + y_0\sqrt{d} > 1$.
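
The description of the solutions of (4.1.8) as powers of $\varepsilon_0$ can be illustrated as follows. This is a deliberately naive sketch: the minimal solution is found by brute force over $y$ (the continued fraction method is far more efficient for large $d$), and the function names are hypothetical. The argument `d` must be a positive non-square integer, otherwise the search does not terminate.

```python
import math

def minimal_pell_solution(d):
    """Smallest (x, y) with x, y > 0 and x^2 - d*y^2 = 1 (d > 0, not a square)."""
    y = 1
    while True:
        x_squared = 1 + d * y * y
        x = math.isqrt(x_squared)
        if x * x == x_squared:
            return x, y
        y += 1

def unit_power(x0, y0, d, n):
    """Integers (x, y) with x + y*sqrt(d) = (x0 + y0*sqrt(d))^n, for n >= 0."""
    x, y = 1, 0
    for _ in range(n):
        # (x + y sqrt d)(x0 + y0 sqrt d) = (x x0 + d y y0) + (x y0 + y x0) sqrt d
        x, y = x * x0 + d * y * y0, x * y0 + y * x0
    return x, y
```

For $d = 2$ the minimal solution is $(3, 2)$, and `unit_power(3, 2, 2, 2)` gives the next one, $(17, 12)$; every power again satisfies $x^2 - 2y^2 = 1$ because the norm is multiplicative.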

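In order to describe the structure of $E_K$ in the general case, one uses the
embedding $\lambda : K \to K \otimes \mathbb{R} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ and the following logarithmic map
$l : (\mathbb{R}^{r_1} \times \mathbb{C}^{r_2})^\times \to \mathbb{R}^{r_1+r_2}$ where for $i \le r_1$ by definition $l_i(x) = \log|x|$,
$l_i : \mathbb{R}^\times \to \mathbb{R}$, and for $i > r_1$, $l_i(x) = \log|x|^2$, $l_i : \mathbb{C}^\times \to \mathbb{R}$. Under the map
$l\lambda$, multiplication in $K$ becomes addition in $\mathbb{R}^{r_1+r_2}$. If $x \in K$ then in view
of (4.1.3) we know that

$$Nx = \lambda_1(x) \cdots \lambda_{r_1}(x)\, \lambda_{r_1+1}(x) \bar{\lambda}_{r_1+1}(x) \cdots \lambda_{r_1+r_2}(x) \bar{\lambda}_{r_1+r_2}(x).$$

Hence

$$\sum_{i=1}^{r_1+r_2} l_i(\lambda_i(x)) = \log|Nx|.$$

In particular, the image $l\lambda(\mathcal{O}_K^\times)$ of $\mathcal{O}_K^\times$ lies in the hyperplane

$$V = \Big\{ (x_1, \dots, x_{r_1+r_2}) \in \mathbb{R}^{r_1+r_2} \;\Big|\; \sum_{i=1}^{r_1+r_2} x_i = 0 \Big\} \cong \mathbb{R}^r, \qquad r = r_1 + r_2 - 1.$$

The kernel of the map $l : (K \otimes \mathbb{R})^\times \to \mathbb{R}^{r_1+r_2}$ is the following compact set

$$\{\pm 1\}^{r_1} \times S^{r_2} \subset \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n,$$

where $S = \{z \in \mathbb{C} \mid |z| = 1\}$ is the unit circle. We see that the logarithmic
map provides an effective way of drawing the units: the kernel of $l\lambda : E_K \to \mathbb{R}^{r_1+r_2}$
consists of only a finite number of elements (the roots of unity in $K$).
Dirichlet's theorem says that the image $l\lambda(E_K)$ is a complete lattice in $V \cong \mathbb{R}^r$
(where $r = r_1 + r_2 - 1$). In other words, one can find elements $\varepsilon_1, \dots, \varepsilon_r \in E_K$
such that any unit $\varepsilon \in E_K$ can be uniquely represented in the form

$$\varepsilon = \eta\, \varepsilon_1^{n_1} \cdots \varepsilon_r^{n_r}$$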
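where $n_i \in \mathbb{Z}$ and $\eta$ is a root of unity in $K$. In particular, $\varepsilon_1, \dots, \varepsilon_r \in E_K$ are
multiplicatively independent:

$$l\lambda(\varepsilon_1), \dots, l\lambda(\varepsilon_r)$$

form a basis of the hyperplane $V$. Consider now the volume $\mathrm{vol}(V/l\lambda(E_K))$
of a fundamental parallelepiped for the lattice of units (with respect to the
measure on $V$ induced by Lebesgue measure on $\mathbb{R}^{r_1+r_2}$). The number
$R_K = \mathrm{vol}(V/l\lambda(E_K))/\sqrt{r+1}$ is called the regulator of $K$ and is equal to the absolute
value of the determinant

$$\begin{vmatrix}
l_1\lambda_1(\varepsilon_1) & l_2\lambda_2(\varepsilon_1) & \dots & l_{r_1+r_2}\lambda_{r_1+r_2}(\varepsilon_1) \\
\vdots & \vdots & & \vdots \\
l_1\lambda_1(\varepsilon_r) & l_2\lambda_2(\varepsilon_r) & \dots & l_{r_1+r_2}\lambda_{r_1+r_2}(\varepsilon_r) \\
(r_1+r_2)^{-1} & (r_1+r_2)^{-1} & \dots & (r_1+r_2)^{-1}
\end{vmatrix}.$$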
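
In the simplest case of a real quadratic field ($r_1 = 2$, $r_2 = 0$, $r = 1$) the determinant above is a $2 \times 2$ determinant and both descriptions of the regulator can be compared numerically. The following sketch assumes that a fundamental unit and its conjugate are supplied as floating point numbers; the function names are invented for this illustration.

```python
import math

def regulator_from_determinant(eps, eps_conj):
    """|det| of the matrix in the text for r1 = 2, r2 = 0 (so r = 1)."""
    unit_row = (math.log(abs(eps)), math.log(abs(eps_conj)))
    const_row = (0.5, 0.5)                 # (r1 + r2)^(-1) in every column
    return abs(unit_row[0] * const_row[1] - unit_row[1] * const_row[0])

def regulator_from_volume(eps, eps_conj):
    """vol(V / l-lambda(E_K)) / sqrt(r + 1) with r = 1.

    The lattice in the line V is generated by the single vector
    (log|eps|, log|eps_conj|), so its covolume is that vector's length.
    """
    length = math.hypot(math.log(abs(eps)), math.log(abs(eps_conj)))
    return length / math.sqrt(2)
```

For $K = \mathbb{Q}(\sqrt{2})$ with $\varepsilon = 1 + \sqrt{2}$ both functions return $\log(1 + \sqrt{2}) \approx 0.8814$, since $|\varepsilon'| = 1/\varepsilon$.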


4.1.5 Lattice Points in a Convex Body 


We now describe a general geometric idea, on which the proof of Dirichlet's
theorem, and some other interesting facts (such as bounds for discriminants
and class numbers) are based.


Theorem 4.7 (Minkowski's Lemma on a Convex Body). Let $M$ be a
lattice in $\mathbb{R}^n$, $\Delta = \mathrm{vol}(\mathbb{R}^n/M)$, and let $X \subset \mathbb{R}^n$ be a centrally-symmetric
convex body of finite volume $v = \mathrm{vol}(X)$. If $v > 2^n\Delta$, then there exists
$0 \ne \alpha \in M \cap X$.


Proof. In order to prove the lemma, it is convenient to consider the lattice
$2M \subset \mathbb{R}^n$ whose fundamental parallelepiped has volume $\mathrm{vol}(\mathbb{R}^n/2M) = 2^n\Delta$.
Then under the natural projection of $X \subset \mathbb{R}^n$ onto a fundamental parallelepiped
of $\mathbb{R}^n/2M$ there will be overlaps in the image of $X$, because the volume
of $X$ is bigger than the volume of a fundamental parallelepiped. Hence there
exist two different points $z_1, z_2 \in X$, $z_1 \ne z_2$, such that $z_1 \equiv z_2 \bmod 2M$, i.e.
$(z_1 - z_2)/2 \in M$. The proof follows: the point $(z_1 - z_2)/2 \ne 0$ belongs to $X$ in
view of its convexity and central symmetry, since $(z_1 - z_2)/2 = (z_1 + (-z_2))/2$
(if $z \in X$ then $-z \in X$).
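
Here are some examples of convex bodies to which we can apply Minkowski's
lemma. Let $x^0 = (x^0_1, \dots, x^0_{r_1+r_2}) \in K \otimes \mathbb{R}$,
$|N(x^0)| = \prod_{i=1}^{r_1} |x^0_i| \prod_{j=1}^{r_2} |x^0_{r_1+j}|^2 \ne 0$. Put

$$W(x^0) = \{x \in K \otimes \mathbb{R} \mid |x_i| < |x^0_i|, \ i = 1, \dots, r_1 + r_2\}.$$

For a positive integer $a$ we put

$$U(a) = \Big\{x \in K \otimes \mathbb{R} \;\Big|\; \sum_{i=1}^{r_1} |x_i| + 2\sum_{j=1}^{r_2} |x_{r_1+j}| < a\Big\}.$$

A calculation of these volumes shows that

$$\mathrm{vol}(W(x^0)) = 2^{r_1}\pi^{r_2}|N(x^0)|, \qquad \mathrm{vol}(U(a)) = 2^{r_1}\left(\frac{\pi}{2}\right)^{r_2} \frac{a^n}{n!}. \qquad (4.1.9)$$

Applying Minkowski's lemma to the lattice $M = \lambda(\mathcal{O}_K)$ and these bodies
(where $\Delta = 2^{-r_2}\sqrt{|D_K|}$ by (4.1.7)), we see that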


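a) for arbitrary constants $c_i > 0$ ($i = 1, \dots, r_1 + r_2$) satisfying the condition

$$\prod_{i=1}^{r_1} c_i \prod_{j=1}^{r_2} c_{r_1+j}^2 > \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_K|}$$

there exists a non-zero element $\alpha \in \mathcal{O}_K$ such that

$$|\lambda_i(\alpha)| < c_i \quad (i = 1, \dots, r_1 + r_2); \qquad (4.1.10)$$

it suffices to take $x^0 \in K \otimes \mathbb{R}$ with $|x^0_i| = c_i$ ($i = 1, \dots, r_1 + r_2$) and
$\lambda(\alpha) \in W(x^0)$;
b) for $a > \left(n! \left(\frac{4}{\pi}\right)^{r_2} \sqrt{|D_K|}\right)^{1/n}$ there exists $\beta \in \mathcal{O}_K$, $\beta \ne 0$, from $U(a)$, such
that

$$\sum_{i=1}^{r_1} |\lambda_i(\beta)| + 2\sum_{j=1}^{r_2} |\lambda_{r_1+j}(\beta)| < a,$$

hence in view of the inequality between the arithmetic and geometric
means $|N(\beta)| < (a/n)^n$. Since $N(\beta)$ is an integer and $a$ may be taken
arbitrarily close to its lower bound, we have the estimate

$$|N(\beta)| \le \left(\frac{4}{\pi}\right)^{r_2} \frac{n!}{n^n} \sqrt{|D_K|}. \qquad (4.1.11)$$

From (4.1.11) and $|N(\beta)| \ge 1$ follows the estimate for the discriminant:

$$|D_K| \ge \left(\frac{\pi}{4}\right)^{2r_2} \frac{n^{2n}}{(n!)^2} = \left(\frac{\pi}{4}\right)^{2r_2} \frac{1}{2\pi n}\, e^{2n - \theta/6n} \quad (0 < \theta < 1),$$

showing that $|D_K|$ grows with $n$.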
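
To get a feeling for the size of this lower bound one can simply tabulate it. The snippet below is a small sketch that evaluates the classical Minkowski bound $(\pi/4)^{2r_2}\, n^{2n}/(n!)^2$ with $n = r_1 + 2r_2$ in floating point; the function name is chosen here for illustration.

```python
import math

def minkowski_discriminant_bound(r1, r2):
    """Lower bound (pi/4)^(2 r2) * n^(2n) / (n!)^2 for |D_K|, n = r1 + 2 r2."""
    n = r1 + 2 * r2
    return (math.pi / 4) ** (2 * r2) * n ** (2 * n) / math.factorial(n) ** 2
```

For real quadratic fields the bound is 4 (the smallest discriminant that occurs is 5, for $\mathbb{Q}(\sqrt{5})$), and for imaginary quadratic fields it is $\pi^2/4 \approx 2.47$ (the smallest absolute value is 3, for $\mathbb{Q}(\sqrt{-3})$). For totally real fields the values 1, 4, 20.25 for $n = 1, 2, 3$ already show the growth with $n$.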


Some other remarkable consequences of Minkowski’s lemma are: 

Theorem 4.8 (Hermite's Theorem, (1863)). There are only finitely many
algebraic number fields with a given discriminant.


Theorem 4.9 (Minkowski's Theorem, (1890)). If $K \ne \mathbb{Q}$ then $|D_K| > 1$.


For the proofs of these theorems cf. [Wei74a].

From the above estimate for the discriminant it follows also that for large
$n$ one has $|D_K|^{1/n} > (7.3)^{r_1/n}(5.9)^{r_2/n}$. However nowadays much stronger
estimates for discriminants are known: $|D_K|^{1/n} > (188)^{r_1/n}(41)^{r_2/n}$ (for large
$n$), cf. [Odl75], [Kuz84]. The latter are deduced from analytic properties of
the Dedekind zeta-function via explicit formulae (cf. §6.2.3 and §6.2.5).


4.1.6 Deduction of Dirichlet’s Theorem From Minkowski’s Lemma 


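Consider the hypersurface $T_c = \{x \in K \otimes \mathbb{R} \mid |Nx| = c\}$ for a fixed $c > 0$.
Under the logarithmic map this becomes the affine hyperplane

$$V_{\log c} = \Big\{ y \in \mathbb{R}^{r_1+r_2} \;\Big|\; \sum_{i=1}^{r_1+r_2} y_i = \log c \Big\}.$$

The group of units $E_K$ acts on $T_c$ by multiplication with $\lambda(\varepsilon)$, $\varepsilon \in E_K$. Under
the logarithmic map the action of $\varepsilon$ becomes a translation by the vector $l\lambda(\varepsilon)$,
which maps $V_{\log c}$ into itself. The number of orbits of this action on $T_c \cap \lambda(\mathcal{O}_K)$
is finite for any fixed $c$. Indeed it suffices to show that if $N(\alpha) = N(\beta) = c \in \mathbb{Z}$
and $\alpha \equiv \beta \pmod c$ in the ring $\mathcal{O}_K$ then $\alpha/\beta \in E_K$. In order to see this, notice
that $\alpha$ divides its norm $N\alpha = c$. Hence the number $\beta/\alpha = 1 + \frac{\beta - \alpha}{\alpha}$ belongs to
$\mathcal{O}_K$. Similarly, $\alpha/\beta \in \mathcal{O}_K$, hence $\beta/\alpha \in E_K = \mathcal{O}_K^\times$.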

We now use the results of §4.1.5, and choose some $c > \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_K|}$.
Then for any element $x \in T_c$ one can find an element $\alpha \in \mathcal{O}_K$ such that
$\lambda(\alpha) \in W(x)$. We use this fact to show that the quotient group $V/l\lambda(E_K)$
is compact. It suffices to show that $V = V_0$ can be covered by translations
of a bounded set by vectors $l\lambda(\varepsilon)$, $\varepsilon \in E_K$. In turn, this is implied by the
analogous statement for any hyperplane parallel to $V$, for example of the type
$V_{\log c}$ instead of $V_0$. For any $\alpha \in \mathcal{O}_K$, $\alpha \ne 0$, consider the set $Y_c(\alpha) \subset V_{\log c}$
consisting of all $y = l(x) \in V_{\log c}$ such that $\lambda(\alpha) \in W(x)$. Then the sets $Y_c(\alpha)$ are all
bounded, $Y_c(\alpha\varepsilon) = Y_c(\alpha) + l\lambda(\varepsilon)$ for $\varepsilon \in E_K$, and Minkowski's lemma implies
that any $y \in V_{\log c}$ is contained in some $Y_c(\alpha)$. On the other hand, we know
that there are only finitely many classes of $\alpha \in \mathcal{O}_K$ with $|N(\alpha)| < c$ modulo
the action of $E_K$. If $\{\alpha_i\}$ is a finite system of representatives of these classes,
then the desired compact set can be defined to be the union $\bigcup_i Y_c(\alpha_i)$. This
proves the compactness statement; discreteness is implied by the analogous
fact for the lattice $\lambda(\mathcal{O}_K)$, and the fact that the logarithmic map restricted
to any hypersurface $T_c$ is a surjective open map onto $V_{\log c}$.


4.2 Decomposition of Prime Ideals, Dedekind Domains, 
and Valuations 


4.2.1 Prime Ideals and the Unique Factorization Property 


The original purpose of Dedekind's theory of ideals was to extend the results
of Kummer on Fermat's theorem to a larger class of exponents. Let $R$ be a
commutative ring with unity. An ideal $\mathfrak{a}$ of $R$ is by definition an additive
subgroup $\mathfrak{a} \subset R$ such that $R\mathfrak{a} \subset \mathfrak{a}$. An ideal $\mathfrak{a} \ne R$ is called prime iff $ab \in \mathfrak{a}$
implies $a \in \mathfrak{a}$ or $b \in \mathfrak{a}$ (i.e. the factor ring $R/\mathfrak{a}$ has no zero-divisors). An ideal
of the type $\mathfrak{a} = (a) = Ra$ for $a \in R$ is called a principal ideal. The notation
$(a_i)_{i \in I}$ denotes the smallest ideal containing all $a_i \in R$ ($i \in I$). An element
$\pi \in R$ is called prime iff $\pi = ab$ implies that either $a$ or $b$ is invertible (i.e. a
unit) in $R$. The reason for the lack of uniqueness of factorization into prime
elements in $R$ is related to the fact that the ideal $(\pi)$ generated by a prime
element $\pi$ is not always prime.


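Example 4.10. Let $R = \mathbb{Z}[\sqrt{-5}]$. Then there are two essentially different
factorizations into prime elements:

$$21 = 3 \cdot 7 = (1 + 2\sqrt{-5}) \cdot (1 - 2\sqrt{-5}).$$

A simple check shows that none of the quotients of two different factors in this
identity belong to $R$. However, the uniqueness of factorization can be restored
if we pass from prime elements to prime ideals. Indeed, the following ideals
are prime:

$$\mathfrak{p}_1 = (3, \sqrt{-5} - 1), \quad \mathfrak{p}_2 = (3, \sqrt{-5} - 2), \quad \mathfrak{p}_3 = (7, \sqrt{-5} - 3), \quad \mathfrak{p}_4 = (7, \sqrt{-5} - 4).$$

This is implied by the decompositions:

$$X^2 + 5 \equiv (X - 1)(X - 2) \pmod 3, \qquad X^2 + 5 \equiv (X - 3)(X - 4) \pmod 7,$$

for example,

$$R/\mathfrak{p}_1 \cong \mathbb{Z}[X]/(3, X - 1, X^2 + 5) \cong \mathbb{F}_3[X]/(X - 1) \cong \mathbb{F}_3,$$

in view of the identity $(X - 1, X^2 + 5) = (X - 1)$ in $\mathbb{F}_3[X]$. Analogously one
proves the decompositions

$$(3) = \mathfrak{p}_1\mathfrak{p}_2, \quad (7) = \mathfrak{p}_3\mathfrak{p}_4, \quad (1 + 2\sqrt{-5}) = \mathfrak{p}_1\mathfrak{p}_3, \quad (1 - 2\sqrt{-5}) = \mathfrak{p}_2\mathfrak{p}_4,$$

and the factorization $(21) = \mathfrak{p}_1\mathfrak{p}_2\mathfrak{p}_3\mathfrak{p}_4$ is the unique decomposition as a product
of four ideals. The ideals $(3)$, $(7)$, $(1 + 2\sqrt{-5})$, $(1 - 2\sqrt{-5})$ are not prime.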
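
The element-level claims of this example can be verified with the norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$. The sketch below stores $a + b\sqrt{-5}$ as a pair `(a, b)`; the helper names are made up for illustration. Since the four factors have norms 9, 49, 21, 21, a proper divisor of any of them would need norm 3 or 7, and a short search shows that no such element exists.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5)."""
    return a * a + 5 * b * b

def mul(x, y):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5)."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

def has_element_of_norm(n):
    """Search for a + b*sqrt(-5) with a^2 + 5 b^2 = n (n > 0)."""
    bound = int(n ** 0.5) + 1
    return any(norm(a, b) == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))
```

Both products `mul((3, 0), (7, 0))` and `mul((1, 2), (1, -2))` equal `(21, 0)`, while `has_element_of_norm(3)` and `has_element_of_norm(7)` are both false, so all four factors are irreducible in $\mathbb{Z}[\sqrt{-5}]$.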

A Dedekind domain is by definition a commutative associative ring with
identity, in which the factorization of non-zero ideals into prime ideals is
unique. This is equivalent to $R$ being a Noetherian (every ideal being finitely
generated), integrally closed (containing every element of its field of fractions
which is integral over $R$) ring, all of whose non-zero prime ideals are maximal
(i.e. $R/\mathfrak{p}$ is a field).

One can prove that the ring $\mathbb{Z}[\sqrt{-5}]$ of our example is a Dedekind domain.
From the given characterization it follows that for a given number field
$K$, $[K : \mathbb{Q}] < \infty$, the ring of integers $\mathcal{O}_K$ is a Dedekind domain. It also follows
that no proper subring of $\mathcal{O}_K$ with the same field of fractions can be
a Dedekind ring, since it cannot be integrally closed. For example, the ring
$\mathbb{Z}[\sqrt{5}]$ is not a Dedekind ring: the ideal $(1 - \sqrt{5})$ cannot be decomposed into a
product of prime ideals. However the bigger ring $\mathbb{Z}\left[\frac{1 + \sqrt{5}}{2}\right] = \mathcal{O}_K$, $K = \mathbb{Q}(\sqrt{5})$,
is a Dedekind ring. Thus one can build a good divisibility theory in this class
of rings by replacing elements $\alpha$ by the corresponding ideals and using prime
ideals rather than prime elements. However, the class of Dedekind rings is
quite narrow, and a good divisibility theory can be built in a much larger class
of rings. For example, in the polynomial ring $k[x_1, x_2, \dots, x_n]$ over a field $k$
one has unique factorization of elements, and the prime elements here are the
irreducible polynomials. On the other hand, the existence and uniqueness of
factorization of ideals into prime ideals does not hold in this ring. For instance,
the ideal $(x^2, y) \subset k[x, y]$ does not have such a decomposition. This last example
partly explains Kronecker's mistrust of the prime ideals of Dedekind.
Kronecker himself began developing a different theory of divisibility, based on
valuations. This is described below (§4.2.5 and §4.3). The history of the
controversy between Kronecker and Dedekind is nicely presented by H. Weyl (cf.
[Wey40]).

Fractional ideals. Let $\mathcal{O}_K$ be the ring of all integers in a number field $K$,
$[K : \mathbb{Q}] < \infty$. A fractional ideal is by definition a non-zero $\mathcal{O}_K$-submodule
$\mathfrak{a} \subset K$ such that $\alpha\mathfrak{a} \subset \mathcal{O}_K$ for some $\alpha \in K^\times$. The properties of Dedekind
domains imply that together with a fractional ideal $\mathfrak{a}$, the $\mathcal{O}_K$-submodule
$\mathfrak{a}^{-1} = \{x \in K \mid x\mathfrak{a} \subset \mathcal{O}_K\}$ will also be a fractional ideal. If $\mathfrak{a}$ and $\mathfrak{b}$ are
fractional ideals, then $\mathfrak{a}\mathfrak{b}$ is also a fractional ideal. Thus the fractional ideals
form a multiplicative group $I_K$ whose identity element is $\mathcal{O}_K$. Since $\mathcal{O}_K$ is
a Dedekind domain, it follows that $I_K$ is a free Abelian group in which the
prime ideals $\mathfrak{p} \subset \mathcal{O}_K$ form a basis: every $\mathfrak{a} \in I_K$ can be uniquely written in
the form:

$$\mathfrak{a} = \mathfrak{p}_1^{n_1} \cdots \mathfrak{p}_k^{n_k} \quad (n_i \in \mathbb{Z}).$$

The norm $N\mathfrak{a}$ of an integral ideal $\mathfrak{a} \subset \mathcal{O}_K$ is defined to be the number
of elements of the corresponding factor ring: $N\mathfrak{a} = \mathrm{Card}(\mathcal{O}_K/\mathfrak{a})$, and the
norm of an arbitrary fractional ideal $\mathfrak{a} \in I_K$ is defined by multiplicativity. If
$\mathfrak{a} = (\alpha)$ is a principal ideal, then $N((\alpha)) = |N\alpha| = |N_{K/\mathbb{Q}}\,\alpha|$: multiplication
by $\alpha$ defines an endomorphism of the lattice $\mathcal{O}_K$, and one easily verifies that
the absolute value of its determinant coincides with the index of its image:

$$(\mathcal{O}_K : (\alpha)) = N((\alpha)).$$


4.2.2 Finiteness of the Class Number 


To each element $\alpha \in K^\times$ one can associate $(\alpha) \in I_K$, so that we have a
homomorphism $K^\times \to I_K$. The image of this homomorphism is called the group
of principal ideals, and is denoted by $P_K$. The quotient group $Cl_K = I_K/P_K$
is called the ideal class group. The following result is another corollary of
Minkowski's lemma.


Theorem 4.11. The group $Cl_K$ is finite.


The order $|Cl_K| = h_K$ is called the class number of $K$.

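In order to prove the theorem we note that each ideal class can be
represented by an integral ideal (replacing if necessary $\mathfrak{a}$ by $M\mathfrak{a}$ with an
appropriate integer $M$, and so getting rid of denominators). According to
Minkowski's lemma (see §4.1.5) there exists a non-zero element $\alpha \in \mathfrak{a}$ such
that $|N\alpha| \le \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_K|}\, N\mathfrak{a}$. We have $\alpha\mathcal{O}_K \subset \mathfrak{a}$ because $\mathfrak{a}$ is an ideal, i.e.
$\mathcal{O}_K \subset \alpha^{-1}\mathfrak{a}$. We see now that the index $(\alpha^{-1}\mathfrak{a} : \mathcal{O}_K) = (\mathcal{O}_K : \alpha\mathfrak{a}^{-1})$ is
bounded by the constant $\left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_K|}$, because

$$(\mathcal{O}_K : \alpha\mathfrak{a}^{-1}) = |N(\alpha)|\, N\mathfrak{a}^{-1} \le \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_K|}.$$

If $\mathfrak{a}'$ is an arbitrary fractional ideal containing $\mathcal{O}_K$ and $(\mathfrak{a}' : \mathcal{O}_K) = r$ then
$r^{-1}\mathcal{O}_K \supset \mathfrak{a}' \supset \mathcal{O}_K$. But it is obvious that the number of intermediate ideals
$\mathfrak{a}'$ between $r^{-1}\mathcal{O}_K$ and $\mathcal{O}_K$ is finite. The theorem follows, in view of the fact
that $r$ can take only a finite number of values.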

As we shall see below, this theorem and Dirichlet’s unit theorem not only 
have similar proofs, but can be incorporated as parts of a more general result 
on the structure of the idele class group (cf. [Chev40], [Wei74a]). 

The class number plays an exceptionally important role in number theory.
For example the statement $h_K = 1$ is equivalent to saying that $\mathcal{O}_K$ is a unique
factorization domain. Another example is that the theorem of Kummer from
§4.1.1 on the first case of Fermat's Last Theorem can be extended to all
prime exponents $n$ with the property that $h_K$ is not divisible by $n$, where
$K = \mathbb{Q}(\exp(2\pi i/n))$ is the corresponding cyclotomic field.

There have been a number of experimental and empirical observations 
of class groups of number fields made over the years. H. Cohen and H. W. 
Lenstra, Jr. in [CoLe83] introduced a heuristic principle that succeeded in pre- 
dicting the statistical distribution of ideal class groups of imaginary quadratic 
number fields and totally real abelian number fields. Many numerically veri- 
fied observations are a precise consequence of the Cohen-Lenstra conjecture, 
cf. e.g. [Lee02], where a relation with Leopoldt’s Spiegelungssatz (cf. [Leo58]) 
is discussed. 


4.2.3 Decomposition of Prime Ideals in Extensions


If $K$ is a number field with ring of integers $\mathcal{O}_K$, and $p$ is a prime number,
then $(p) = p\mathcal{O}_K$ can be decomposed into a product of prime ideals of $\mathcal{O}_K$:

$$(p) = \mathfrak{p}_1^{e_1} \mathfrak{p}_2^{e_2} \cdots \mathfrak{p}_r^{e_r}. \qquad (4.2.1)$$

The form of the decomposition (4.2.1) for primes $p$ is one of the most important
characteristics of $K$; if say $K/\mathbb{Q}$ is a Galois extension, then $K$ is
uniquely determined by the set of primes $p$ satisfying $(p) = \mathfrak{p}_1\mathfrak{p}_2 \cdots \mathfrak{p}_n$,
where $n = [K : \mathbb{Q}]$ (the product of $n$ distinct primes). If this is the case for
$p$, we say that $p$ splits completely in $K$. For a general number field it is difficult
to determine the precise form of the decomposition (4.2.1) for all $p$. This
problem is related to the deepest questions of algebraic number theory
("non-commutative class field theory", see §6.4). However, for Abelian extensions $K$,
i.e. Galois extensions $K/\mathbb{Q}$ whose Galois group $G(K/\mathbb{Q})$ is commutative, this
decomposition is known. We shall give the precise form of the decomposition
for quadratic fields $K = \mathbb{Q}(\sqrt{d})$ and cyclotomic fields $K = \mathbb{Q}(\sqrt[m]{1})$. This is
done by a general method, applicable to any extension $R \subset S$ of commutative
rings, where it is supposed that $S$ is a finitely generated $R$-module. In
this case each element $\alpha \in S$ is a root of a normalized (monic) polynomial
$f_\alpha(X) \in R[X]$. For example, one could take $f_\alpha(X) = X^n + a_{n-1}X^{n-1} + \dots + a_0$,
$a_i \in R$ (the characteristic polynomial). Let $\mathfrak{p}$ be a maximal ideal in $R$. Denote
by $\bar{\alpha}$ the image of $\alpha$ in the quotient ring $S/\mathfrak{p}S$.


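Theorem 4.12 (Theorem on the Decomposition of a Maximal Ideal).
Suppose that for an element $\alpha \in S$ one has $S/\mathfrak{p}S = (R/\mathfrak{p})[\bar{\alpha}]$ and
$n = \deg f_\alpha(X) = \dim_{R/\mathfrak{p}} S/\mathfrak{p}S$. Choose normalized polynomials $g_1(X), \dots, g_r(X) \in R[X]$
such that

$$f_\alpha(X) \equiv g_1(X)^{e_1} \cdots g_r(X)^{e_r} \pmod{\mathfrak{p}R[X]} \qquad (4.2.2)$$

where $g_i(X) \pmod{\mathfrak{p}R[X]}$ are distinct and irreducible in $(R/\mathfrak{p})[X]$. Then the
ideals $\mathfrak{P}_i = (\mathfrak{p}, g_i(\alpha))$ are maximal, and the following decomposition holds:

$$\mathfrak{p}S = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_r^{e_r}. \qquad (4.2.3)$$

The maximality of $\mathfrak{P}_i$ follows from the isomorphism:

$$S/\mathfrak{P}_i \cong R[X]/(g_i(X), \mathfrak{p}) \cong (R/\mathfrak{p})[X]/(\bar{g}_i(X))$$

and from the irreducibility of $g_i(X) \pmod{\mathfrak{p}R[X]}$; the decomposition (4.2.3)
is deduced from an analogue of the theorem on tensor products of fields, see
Theorem 4.5 in §4.1.3 (or from the Chinese Remainder theorem):

$$S/\mathfrak{p}S \cong S \otimes_R (R/\mathfrak{p}) \cong \prod_{i=1}^{r} (R/\mathfrak{p})[X]/(\bar{g}_i(X)^{e_i}).$$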
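
Example 4.13. a) Quadratic Fields (see [BS85], Chapter 2). For a quadratic
extension $K = \mathbb{Q}(\sqrt{d})$ ($d \in \mathbb{Z}$ being square free), $\mathcal{O} = \mathcal{O}_K = \mathbb{Z}[\omega]$ where

$$\omega = \frac{1 + \sqrt{d}}{2}, \quad f_\omega(X) = X^2 - X - (d - 1)/4 \quad \text{and} \quad D_K = d \quad \text{for } d \equiv 1 \pmod 4,$$

$$\omega = \sqrt{d}, \quad f_\omega(X) = X^2 - d \quad \text{and} \quad D_K = 4d \quad \text{for } d \equiv 2, 3 \pmod 4.$$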


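The result on the decomposition of primes can be conveniently stated in terms
of the quadratic character $\chi_K$ of $K$. By definition $\chi_K$ is the unique primitive
Dirichlet character of order 2 modulo $|D_K|$ such that $\chi_K(-1) = \mathrm{sgn}\, D_K$. It can
be written explicitly as follows

$$\chi_K(x) = \begin{cases}
\left(\frac{x}{|d|}\right), & \text{if } d \equiv 1 \pmod 4, \\
(-1)^{(x-1)/2} \left(\frac{x}{|d|}\right), & \text{if } d \equiv 3 \pmod 4, \\
(-1)^{(x^2-1)/8 + (x-1)(d'-1)/4} \left(\frac{x}{|d'|}\right), & \text{if } d = 2d',\ d' \equiv 1 \pmod 2.
\end{cases}$$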


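Then $p$ decomposes in $\mathcal{O}_K$ as follows:

$$p\mathcal{O}_K = \begin{cases}
\mathfrak{p}\mathfrak{p}',\ \mathfrak{p} \ne \mathfrak{p}', \text{ and } N\mathfrak{p} = N\mathfrak{p}' = p & \text{for } \chi_K(p) = 1, \\
\mathfrak{p},\ N\mathfrak{p} = p^2 \text{ (i.e. } p \text{ remains prime)} & \text{for } \chi_K(p) = -1, \\
\mathfrak{p}^2,\ N\mathfrak{p} = p & \text{for } \chi_K(p) = 0.
\end{cases}$$

In order to prove these decompositions one applies the above theorem with
$R = \mathbb{Z}$, $S = \mathcal{O}_K$, $\alpha = \omega$, using the decomposition of the corresponding
quadratic polynomial $f_\omega(X) \bmod p$, which either has two distinct roots over
$\mathbb{F}_p$, or is irreducible, or has a double root over $\mathbb{F}_p$ in the cases when $\chi_K(p) = 1$,
$\chi_K(p) = -1$ or $\chi_K(p) = 0$ respectively. This result can be elegantly rewritten
as an identity for the Euler factors of the Dedekind zeta-function (cf. §6.2.3
below):

$$\prod_{\mathfrak{p} \mid (p)} (1 - N\mathfrak{p}^{-s}) = (1 - p^{-s})(1 - \chi_K(p)p^{-s}) \quad (s \in \mathbb{C}). \qquad (4.2.4)$$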
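
The proof just outlined is effectively an algorithm: reduce the minimal polynomial of $\omega$ modulo $p$ and count its roots in $\mathbb{F}_p$. The following sketch does this by brute force; it assumes that `d` is square free and different from 1 and that `p` is prime, uses $X^2 - X + (1-d)/4$ for $d \equiv 1 \pmod 4$ and $X^2 - d$ otherwise, and the function name `splitting_type` is illustrative.

```python
def splitting_type(d, p):
    """Return 'split', 'inert' or 'ramified' for the prime p in Q(sqrt d)."""
    if d % 4 == 1:
        c = (1 - d) // 4                  # minimal polynomial X^2 - X + c of omega
        values = [(x * x - x + c) % p for x in range(p)]
    else:
        values = [(x * x - d) % p for x in range(p)]
    roots = values.count(0)
    # a quadratic over F_p has 2 distinct roots, 1 double root, or none
    return {2: "split", 1: "ramified", 0: "inert"}[roots]
```

For $d = -5$ the primes 3 and 7 split, in agreement with the ideals $\mathfrak{p}_1, \dots, \mathfrak{p}_4$ of Example 4.10, while 2 and 5 ramify (they divide $D_K = -20$). Counting roots over all of $\mathbb{F}_p$ costs $O(p)$ operations; for odd $p$ one would normally evaluate a Legendre symbol instead.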


Example 4.14. b) Cyclotomic fields. $K = K_m = \mathbb{Q}(\zeta_m)$. We use the fact
$\mathcal{O}_K = \mathbb{Z}[\zeta_m]$. Consider the extension $\mathbb{Z}[\zeta_m] \supset \mathbb{Z}$, and take for $f_\alpha(X)$ the cyclotomic
polynomial $\Phi_m(X)$ (see 4.1.2). The proof that $\mathcal{O}_K$ coincides with $\mathbb{Z}[\zeta_m]$ is
rather delicate but elementary; it is based on a calculation of the discriminant of
$R = \mathbb{Z}[\zeta_m]$ which turns out to be equal to

$$(-1)^{\varphi(m)/2}\, m^{\varphi(m)} \Big/ \prod_{p \mid m} p^{\varphi(m)/(p-1)},$$

see [BS85].
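

4.2.4 Decomposition of Primes in Cyclotomic Fields


Theorem 4.15. a) Let $p \nmid m$, then

$$pR = \mathfrak{p}_1 \cdots \mathfrak{p}_r, \qquad N\mathfrak{p}_i = p^f,$$

where $\mathfrak{p}_i \subset R$ are distinct prime ideals, and the number $f$ is equal to the order
of $p \bmod m$ in $(\mathbb{Z}/m\mathbb{Z})^\times$, $f \cdot r = \varphi(m)$.
b) If $m = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$ then

$$p_i R = (\mathfrak{p}_1 \cdots \mathfrak{p}_{r'})^{\varphi(p_i^{\alpha_i})}, \qquad N\mathfrak{p}_j = p_i^{f'},$$

where $\varphi(p_i^{\alpha_i}) = p_i^{\alpha_i - 1}(p_i - 1)$, $f'$ is equal to the order of the element
$p_i \bmod mp_i^{-\alpha_i}$ in $(\mathbb{Z}/mp_i^{-\alpha_i}\mathbb{Z})^\times$, and $f' \cdot r' = \varphi(mp_i^{-\alpha_i})$.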


Proof. Note first that for prime ideals $\mathfrak{p} \subset \mathcal{O}_K$ the number $f = \log_p N\mathfrak{p}$
coincides with the degree of the corresponding extension of residue fields:
$f = [(\mathcal{O}_K/\mathfrak{p}) : \mathbb{F}_p]$, and thus $f$ is the order of the Frobenius automorphism
$x \mapsto x^p$, generating the cyclic Galois group $G((\mathcal{O}_K/\mathfrak{p})/\mathbb{F}_p)$. Applying the
theorem on the decomposition of maximal ideals we see that it suffices to find
the form of the decomposition of the cyclotomic polynomial $\Phi_m(X) \bmod p$
in $\mathbb{F}_p[X]$ into irreducible polynomials.

It follows also that the form of the decomposition depends in this case only
on $p \bmod m$. In particular, $p$ splits completely in $K$ $\iff$ $p \equiv 1 \bmod m$. A
useful observation is that the decomposition of $(p)$ in $\mathcal{O}_{K_m}$ is fully determined
by the action of the Frobenius endomorphism $\mathrm{Fr}_p$ on the finite ring $\mathcal{O}_K/(p)$,
so that in the case $p \nmid m$ this endomorphism may be regarded as the element
of the Galois group $G(K_m/\mathbb{Q})$:

$$(\mathrm{Fr}_p : \zeta_m \mapsto \zeta_m^p) \longleftrightarrow p \bmod m \in (\mathbb{Z}/m\mathbb{Z})^\times \cong G(K_m/\mathbb{Q}).$$
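
Part a) of Theorem 4.15 reduces the splitting of an unramified prime to a computation in $(\mathbb{Z}/m\mathbb{Z})^\times$, which the following sketch carries out. The function names are chosen for illustration, and `cyclotomic_splitting` assumes that `p` is a prime not dividing `m`.

```python
from math import gcd

def euler_phi(m):
    """Euler's function, by direct count."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def multiplicative_order(a, m):
    """Order of a in (Z/mZ)^x; requires gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    one = 1 % m                      # equals 0 when m == 1
    k, x = 1, a % m
    while x != one:
        x = x * a % m
        k += 1
    return k

def cyclotomic_splitting(p, m):
    """(f, r): residue degree f and number r of primes above p in Q(zeta_m)."""
    f = multiplicative_order(p, m)
    return f, euler_phi(m) // f
```

For instance `cyclotomic_splitting(13, 12)` returns `(1, 4)`, so 13 splits completely in $\mathbb{Q}(\zeta_{12})$ because $13 \equiv 1 \bmod 12$, while `cyclotomic_splitting(3, 7)` returns `(6, 1)`: the prime 3 stays prime in $\mathbb{Q}(\zeta_7)$.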


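It is useful for further applications to reformulate Theorem 4.15 using the
Dirichlet characters $\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$. The conductor of $\chi$ is by definition
the least positive integer $m(\chi)$ such that $\chi$ can be defined modulo $m(\chi)$, i.e.
such that $\chi$ factors through the natural projection

$$(\mathbb{Z}/m\mathbb{Z})^\times \to (\mathbb{Z}/m(\chi)\mathbb{Z})^\times \xrightarrow{\ \chi_0\ } \mathbb{C}^\times.$$

The corresponding character $\chi_0 \bmod m(\chi)$ is called the primitive Dirichlet
character associated with $\chi$. Theorem 4.15 is equivalent to the following
identity (cf. §6.2.3 below):

$$\prod_{\mathfrak{p} \mid (p)} (1 - N\mathfrak{p}^{-s}) = \prod_{\chi \bmod m} (1 - \chi_0(p)\, p^{-s}) \qquad (s \in \mathbb{C}). \tag{4.2.5}$$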


Indeed, the theorem implies that the left hand side has the form $(1 - p^{-fs})^r$
for $p \nmid m$, and $(1 - p^{-f's})^{r'}$ for $p \mid m$, where $f$ is equal to the order of $p \bmod m$
in $(\mathbb{Z}/m\mathbb{Z})^\times$, $f \cdot r = \varphi(m)$, $f'$ is equal to the order of the element $p_i \bmod m'$
(with $m' = m p_i^{-\alpha_i}$) in $(\mathbb{Z}/m'\mathbb{Z})^\times$, and $f' \cdot r' = \varphi(m')$. It remains to verify the
equation

$$(1 - T^f)^r = \prod_{\chi \bmod m} (1 - \chi(p)\, T). \tag{4.2.6}$$

Let $\mu_f$ be the group of roots of unity of degree $f$; then $1 - T^f = \prod_{\zeta \in \mu_f} (1 - \zeta T)$.
Equations (4.2.5) and (4.2.6) follow from the fact that for any $\zeta \in \mu_f$ there
are exactly $r$ characters $\chi \pmod m$ such that $\chi(p) = \zeta$ (cf. [BS85], [La70],
[Se70]).


4.2.5 Prime Ideals, Valuations and Absolute Values 


An alternative approach to the theory of divisibility has arisen from the notion
of the $\pi$-order $\mathrm{ord}_\pi a$ of an element $a \ne 0$, $a \in R$, for a prime element $\pi$
of a unique factorization domain $R$: here $\mathrm{ord}_\pi a$ is defined to be the largest
exponent of $\pi$ dividing $a$ in $R$, so that there is a decomposition $a = \varepsilon \pi_1^{k_1} \cdots \pi_r^{k_r}$
in which $k_i = \mathrm{ord}_{\pi_i} a$ and $\varepsilon \in R^\times$ is a unit.

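The function $\mathrm{ord}_\pi$ can be uniquely extended to the field of fractions $K$ of
$R$ as a homomorphism $\mathrm{ord}_\pi : K^\times \to \mathbb{Z}$ with the following properties:

1) $\forall a, b \in K^\times$: $\mathrm{ord}_\pi(ab) = \mathrm{ord}_\pi a + \mathrm{ord}_\pi b$,
2) $\forall a, b \in K^\times$: $\mathrm{ord}_\pi(a + b) \ge \min(\mathrm{ord}_\pi a, \mathrm{ord}_\pi b)$,
3) $a$ divides $b$ in $R$ $\iff$ $\forall \pi$: $\mathrm{ord}_\pi a \le \mathrm{ord}_\pi b$,
4) $\pi R = \{a \in R \mid \mathrm{ord}_\pi a > 0\}$ is a prime ideal of $R$,
5) $R = \{x \in K^\times \mid \forall \pi: \mathrm{ord}_\pi x \ge 0\} \cup \{0\}$.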


Generalizing, for an arbitrary field $K$ the notion of a valuation $v$ is introduced
as a function $v : K^\times \to \mathbb{Z}$ satisfying the conditions

1) $\forall a, b \in K^\times$: $v(ab) = v(a) + v(b)$,
2) $\forall a, b \in K^\times$: $v(a + b) \ge \min(v(a), v(b))$.

More often one uses instead of $v$ a multiplicative absolute value: for a fixed $\rho$,
$0 < \rho < 1$, put $|a|_{\rho,v} = \rho^{v(a)}$, $|0|_{\rho,v} = 0$.

Definition 4.16. An absolute value $|\cdot|$ of a field $K$ is a real-valued function
$x \mapsto |x|$ with non-negative values, such that

1) $\forall a, b \in K^\times$: $|a \cdot b| = |a| \cdot |b|$,
2) $\forall a, b \in K^\times$: $|a + b| \le |a| + |b|$,
3) $|a| = 0 \iff a = 0$.

An absolute value is called non-Archimedean iff instead of 2) the following
stronger inequality is satisfied:

2') $\forall a, b \in K^\times$: $|a + b| \le \max(|a|, |b|)$.

Thus the function $|\cdot|_{\rho,v}$ is a non-Archimedean absolute value. An absolute
value of the type $|\cdot|_{\rho,v}$ is called a discrete absolute value. An example of such
an absolute value is given by the p-adic absolute value $|a/b|_p = p^{\mathrm{ord}_p b - \mathrm{ord}_p a}$
($a, b \in \mathbb{Z}$) of the field $\mathbb{Q}$. The usual absolute value $|x|$ of $x \in \mathbb{Q} \subset \mathbb{R}$ is an
Archimedean absolute value of $\mathbb{Q}$.
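
As a small illustration of the p-adic absolute value just defined, here is a minimal Python sketch using exact rational arithmetic. The names `ord_p` and `padic_abs` are ours, chosen for this example only.

```python
from fractions import Fraction


def ord_p(n, p):
    """Largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("ord_p(0) is not defined")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_abs(x, p):
    """The p-adic absolute value |x|_p of a rational number x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v = ord_p(x.numerator, p) - ord_p(x.denominator, p)
    return Fraction(1, p) ** v       # p**(-v), kept as an exact fraction
```

One can check on examples that the stronger inequality 2') holds: `padic_abs(a + b, p)` never exceeds the larger of `padic_abs(a, p)` and `padic_abs(b, p)`.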


If $|\cdot|$ is a non-Archimedean absolute value of $K$, then the subset $\mathcal{O} =
\{x \in K \mid |x| \le 1\}$ is a ring with a unique maximal ideal $\mathfrak{p} = \{x \in K \mid |x| < 1\}$.
Such rings are called valuation rings. For the discrete absolute value $|\cdot| = |\cdot|_{\rho,v}$
corresponding to a valuation $v$, the notation $R_{(v)} = \mathcal{O}$, $\mathfrak{p}_{(v)} = \mathfrak{p}$ is used, and
$\mathfrak{p}_{(v)}$ is a principal ideal generated by any $\pi \in K$ such that $v(\pi) = 1$.

Now one can define a divisibility theory on an integral domain $R$ with field
of fractions $K$ with the help of a family of valuations $\Sigma = \{v\}$ such that the
following properties are satisfied:

1) $a$ divides $b$ in $R$ $\iff$ $\forall v \in \Sigma$: $v(a) \le v(b)$;
2) for all $a \in K^\times$ one has $v(a) = 0$ for all but a finite number of $v \in \Sigma$;
3) the set $R_{(v)} = \{x \in K^\times \mid v(x) \ge 0\} \cup \{0\}$ uniquely determines $v$;
4) $R = \bigcap_{v \in \Sigma} R_{(v)}$.

If such a family $\Sigma$ exists then the group of divisors $\mathcal{D} = \mathcal{D}_\Sigma$ is defined to
be the free Abelian group with basis $\Sigma$. Its elements are written additively
as finite formal sums $\sum_i k_i v_i$ or multiplicatively $\prod_i \mathfrak{p}_{v_i}^{k_i}$, where only finitely
many of the $k_i$ are non-zero. The following homomorphism is defined:

$$\mathrm{div} : K^\times \to \mathcal{D}, \qquad \mathrm{div}(x) = \prod_{v \in \Sigma} \mathfrak{p}_v^{v(x)}.$$

This homomorphism is called a divisor map on $R$.

The class of rings with a divisibility theory is larger than the class of
Dedekind rings, and it admits a purely algebraic characterization as the class
of Krull rings. Notice that in order to construct valuations, not all of the
prime ideals of the ring are used. If we try to define for a prime ideal $\mathfrak{p} \subset R$
a valuation $v$ by putting, for $a \in R$, $v(a) = \max\{n \ge 0 \mid a \in \mathfrak{p}^n\}$, then we
succeed only when the localization $R_\mathfrak{p}$ of $R$ with respect to $\mathfrak{p}$ is a Noetherian,
integrally closed ring with a unique maximal ideal, where

$$R_\mathfrak{p} = \{x = a/b \mid a, b \in R,\ b \notin \mathfrak{p}\}.$$


The idea of using valuations rather than prime ideals, which arose from
the study of algebraic numbers, has turned out to be very fruitful in algebraic
geometry. In turn, developments in algebraic geometry have led to a number
of inventions in number theory (cf. Chapters 5 and 6).

To conclude this section we remark that all absolute values of $\mathbb{Q}$ either have
the form $|x|^\alpha$ ($0 < \alpha \le 1$, $|x|$ being the usual absolute value of $x \in \mathbb{Q} \subset \mathbb{R}$), or
have the form $|x|_p^\alpha$ ($\alpha > 0$, where $|x|_p$ is the p-adic absolute value of $x \in \mathbb{Q}$).
This result is due to Ostrowski, cf. [BS85], [Chev40].


4.3 Local and Global Methods 


4.3.1 p-adic Numbers 


The idea of extending the field $\mathbb{Q}$ appears in algebraic number theory in various
different guises. For example, the embedding $\mathbb{Q} \subset \mathbb{R}$ often gives useful
necessary conditions for the existence of solutions to Diophantine equations
over $\mathbb{Q}$ or $\mathbb{Z}$. The important feature of $\mathbb{R}$ is its completeness: every Cauchy
sequence $\{\alpha_n\}_{n=1}^\infty$ in $\mathbb{R}$ has a limit $\alpha$ (a sequence is called Cauchy if for any
$\varepsilon > 0$ we have $|\alpha_n - \alpha_m| < \varepsilon$ whenever $n$ and $m$ are greater than some large
$N = N(\varepsilon)$). Also, every element of $\mathbb{R}$ is the limit of some Cauchy sequence
$\{\alpha_n\}_{n=1}^\infty$ with $\alpha_n \in \mathbb{Q}$.

An analogous construction exists using the p-adic absolute value $|\cdot|_p$ of
$\mathbb{Q}$ (see §4.2):

$$|\cdot|_p : \mathbb{Q} \to \mathbb{R}_{\ge 0} = \{x \in \mathbb{R} \mid x \ge 0\},$$

$$|a/b|_p = p^{\mathrm{ord}_p b - \mathrm{ord}_p a}, \qquad |0|_p = 0,$$

where $\mathrm{ord}_p a$ is the exponent of the highest power of $p$ dividing the integer $a$. This general
construction of “adjoining the limits of Cauchy sequences” to a field $k$ with an
absolute value $|\cdot|$ leads to a completion of $k$. This completion, often denoted
$\hat{k}$, is complete, and contains $k$ as a dense subfield with respect to the extended
absolute value $|\cdot|$, [BS85], [Kob80].

As was noted at the end of §4.2, all absolute values of $\mathbb{Q}$ are equivalent either
to the usual Archimedean absolute value, or to the p-adic absolute value.
Thus any completion of $\mathbb{Q}$ is either $\mathbb{R}$, or $\mathbb{Q}_p$, the field of p-adic numbers, i.e.
the completion of the field of rational numbers $\mathbb{Q}$ with respect to the p-adic
absolute value. Using the embeddings $\mathbb{Q} \hookrightarrow \mathbb{R}$ and $\mathbb{Q} \hookrightarrow \mathbb{Q}_p$ (for all primes
$p$) many arithmetical problems can be simplified. An important example is
given by the following Minkowski-Hasse theorem [BS85], [Cas78], [Chev40]:
the equation

$$Q(x_1, x_2, \ldots, x_n) = 0, \tag{4.3.1}$$

given by a quadratic form $Q(x_1, x_2, \ldots, x_n) = \sum_{i,j} a_{ij} x_i x_j$, $a_{ij} \in \mathbb{Q}$, has a
non-trivial solution in rational numbers iff it is non-trivially solvable over
$\mathbb{R}$ and over all $\mathbb{Q}_p$. There are very effective tools for finding solutions in $\mathbb{Q}_p$.
These tools are somewhat analogous to those for $\mathbb{R}$ such as the “Newton-Raphson
algorithm”, which in the p-adic case becomes Hensel's lemma.

The simplest way to define the p-adic numbers is to consider expressions
of the type

$$\alpha = a_m p^m + a_{m+1} p^{m+1} + \cdots, \tag{4.3.2}$$

where $a_i \in \{0, 1, \ldots, p-1\}$ are digits to the base $p$, and $m \in \mathbb{Z}$. It is convenient
to write down $\alpha$ as a sequence of digits, infinite to the left:

$$\alpha = \begin{cases} \cdots a_{m+1} a_m \underbrace{00 \ldots 0}_{m \text{ zeros}}{}_{(p)}, & \text{if } m \ge 0, \\ \cdots a_1 a_0 . a_{-1} \cdots a_m{}_{(p)}, & \text{if } m < 0. \end{cases}$$


These expressions form a field, in which algebraic operations are executed
in the same way as for natural numbers $n = a_0 + a_1 p + \cdots + a_r p^r$, written
as sequences of digits to the base $p$. Consequently, this field contains all the
natural numbers and hence all rational numbers. For example,

$$-1 = \frac{p-1}{1-p} = (p-1) + (p-1)p + (p-1)p^2 + \cdots = \cdots (p-1)(p-1)(p-1)_{(p)};$$

$$\frac{-a_0}{p-1} = a_0 + a_0 p + a_0 p^2 + \cdots = \cdots a_0 a_0 a_0{}_{(p)}.$$

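For $n \in \mathbb{N}$ the expression for $-n = n \cdot (-1)$ of type (4.3.2) is obtained if we
multiply the above expressions for $n$ and for $-1$. Generally, for $\alpha \in \mathbb{Q}$ write
$\alpha = c - \frac{a}{b}$, where $a, c \in \mathbb{Z}$, $b \in \mathbb{N}$, $0 \le a < b$, i.e. $a/b$ is a proper fraction.
Then, for $b$ prime to $p$, by an elementary theorem of Euler, $p^{\varphi(b)} - 1 = bu$, $u \in \mathbb{N}$. Hence

$$-\frac{a}{b} = \frac{au}{1 - p^{\varphi(b)}},$$

and $au < bu = p^r - 1$, $r = \varphi(b)$. Now let $au$ be written to the base $p$ as
$a_{r-1} \cdots a_0{}_{(p)}$; then the expression of type (4.3.2) for $\alpha$ is obtained as the sum
of the expression for $c \in \mathbb{N}$ and

$$-\frac{a}{b} = \cdots \underbrace{a_{r-1} \cdots a_0}_{r \text{ digits}}\, \underbrace{a_{r-1} \cdots a_0}_{r \text{ digits}}{}_{(p)}.$$
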
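For example, if $p = 5$,

$$\frac{9}{7} = 2 - \frac{5}{7} = 2 + \frac{5 \cdot 2232}{1 - 5^6},$$

so that

$$2232 = 32412_{(5)} = 3 \cdot 5^4 + 2 \cdot 5^3 + 4 \cdot 5^2 + 1 \cdot 5 + 2,$$

thus

$$\frac{9}{7} = \cdots 324120324120324122_{(5)}.$$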
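
The digits of such an expansion can also be produced one at a time, by reducing modulo $p$ and dividing by $p$. The following Python sketch does this for a rational number whose denominator is prime to $p$; it is a direct digit-by-digit method rather than the geometric series argument used above, and the name `padic_digits` is ours.

```python
from fractions import Fraction


def padic_digits(x, p, n):
    """First n digits a_0, a_1, ..., a_{n-1} of the p-adic expansion of x.

    x must be a rational number whose denominator is prime to p,
    so that x lies in Z_p and the expansion starts at a_0.
    """
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("denominator must be prime to p")
    digits = []
    for _ in range(n):
        # a is the residue of x modulo p (needs Python 3.8+ for pow(., -1, p))
        a = x.numerator * pow(x.denominator, -1, p) % p
        digits.append(a)
        x = (x - a) / p              # still has denominator prime to p
    return digits
```

Read from right to left, `padic_digits(Fraction(9, 7), 5, 12)` reproduces the last twelve digits of the expansion of $9/7$ displayed above, and `padic_digits(-1, p, n)` consists of the digit $p - 1$ repeated.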


It is easy to verify that the completion of $\mathbb{Q}$ with respect to the p-adic
metric $|\cdot|_p$ can be identified with the described field of p-adic expansions
(4.3.2), where $|\alpha|_p = p^{-m}$ for $\alpha$ as in (4.3.2) with $a_m \ne 0$ (see [Kob80]).

It is curious to compare the expansions (4.3.2), infinite to the left, with the
ordinary expansions of real numbers $\alpha \in \mathbb{R}$, infinite to the right:

$$\alpha = a_m a_{m-1} \cdots a_0 . a_{-1} \cdots = a_m 10^m + a_{m-1} 10^{m-1} + \cdots + a_0 + a_{-1} 10^{-1} + \cdots,$$

where $a_i \in \{0, 1, \ldots, 9\}$ are digits, $a_m \ne 0$. These expansions to any natural
base lead to the same field $\mathbb{R}$. Also, a given $\alpha$ can possess various expressions of
this type, e.g. $2.000\cdots = 1.999\cdots$. However, in the p-adic case the expressions
(4.3.2) are uniquely determined by $\alpha$. This fact provides additional comfort
when calculating with p-adic numbers.

The field $\mathbb{Q}_p$ is a complete metric space with the topology generated by
the “open discs”:

$$U_a(r) = \{x \mid |x - a| < r\} \qquad (x, a \in \mathbb{Q}_p,\ r > 0)$$

(or “closed discs” $D_a(r) = \{x \mid |x - a| \le r\}$). From the topological point of
view, the sets $U_a(r)$ and $D_a(r)$ are both open and closed in $\mathbb{Q}_p$.

An important topological property of $\mathbb{Q}_p$ is its local compactness: all discs
of finite radius are compact. The easiest way to show this is to consider any
sequence $\{\alpha_n\}_{n=1}^\infty$ of elements $\alpha_n \in D_a(r)$ and to construct a limit point.
Such a point may be found step-by-step using the p-adic digits (4.3.2). One
knows that the number of digits “after the point” is bounded on any finite
disc. In particular, the disc

$$\mathbb{Z}_p = D_0(1) = \{x \mid |x|_p \le 1\} = \{x = a_0 + a_1 p + a_2 p^2 + \cdots\}$$

is a compact topological ring, whose elements are called p-adic integers. $\mathbb{Z}_p$ is
the closure of $\mathbb{Z}$ in $\mathbb{Q}_p$. The ring $\mathbb{Z}_p$ is local, i.e. it has only one maximal ideal
$p\mathbb{Z}_p = U_0(1)$ with residue field $\mathbb{Z}_p/p\mathbb{Z}_p = \mathbb{F}_p$. The set of invertible elements
(units) of $\mathbb{Z}_p$ is

$$\mathbb{Z}_p^\times = \mathbb{Z}_p \setminus p\mathbb{Z}_p = \{x \mid |x|_p = 1\} = \{x = a_0 + a_1 p + a_2 p^2 + \cdots \mid a_0 \ne 0\}.$$

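For each $x \in \mathbb{Z}_p$ its Teichmüller representative

$$\omega(x) = \lim_{n \to \infty} x^{p^n}$$

is defined. This limit always exists and satisfies the relations $\omega(x)^p = \omega(x)$,
$\omega(x) \equiv x \bmod p$. For example, if $p = 5$, we have

$\omega(1) = 1$;
$\omega(2) = 2 + 1 \cdot 5 + 2 \cdot 5^2 + 1 \cdot 5^3 + 3 \cdot 5^4 + \cdots$;
$\omega(3) = 3 + 3 \cdot 5 + 2 \cdot 5^2 + 3 \cdot 5^3 + 1 \cdot 5^4 + \cdots$;
$\omega(4) = 4 + 4 \cdot 5 + 4 \cdot 5^2 + 4 \cdot 5^3 + 4 \cdot 5^4 + \cdots = -1$;
$\omega(5) = 0$.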
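
The digits in these examples can be checked numerically. Since $x \equiv \omega(x) \bmod p$ implies $x^{p^{n-1}} \equiv \omega(x) \bmod p^n$, a single modular exponentiation gives $\omega(x)$ to $n$ digits. The Python sketch below assumes an integer argument $x$; the names `teichmuller` and `base_p_digits` are ours.

```python
def teichmuller(x, p, n):
    """The Teichmuller representative of the integer x, modulo p**n."""
    return pow(x, p ** (n - 1), p ** n)


def base_p_digits(value, p, n):
    """The n lowest base-p digits of a non-negative integer, lowest first."""
    digits = []
    for _ in range(n):
        digits.append(value % p)
        value //= p
    return digits
```

For $p = 5$, `base_p_digits(teichmuller(2, 5, 5), 5, 5)` returns the digits 2, 1, 2, 1, 3 listed above for $\omega(2)$.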


The ring $\mathbb{Z}_p$ can also be described as the projective limit

$$\varprojlim_n \mathbb{Z}/p^n\mathbb{Z}$$

of rings $A_n = \mathbb{Z}/p^n\mathbb{Z}$ with respect to the homomorphisms $\varphi_n : A_n \to A_{n-1}$ of
reduction modulo $p^{n-1}$. The sequence

$$\cdots \to A_n \xrightarrow{\varphi_n} A_{n-1} \to \cdots \to A_2 \xrightarrow{\varphi_2} A_1 \tag{4.3.3}$$

forms a projective system indexed by positive integers $n \ge 1$. The projective
limit of the system is defined as a ring

$$A = \varprojlim_n A_n$$

with the following universal property: there are uniquely defined projections

$$\pi_n : \varprojlim_n A_n \to A_n,$$

such that for an arbitrary ring $B$ and a system of homomorphisms $\psi_n : B \to A_n$
compatible with each other under the condition $\psi_{n-1} = \varphi_n \circ \psi_n$ for $n \ge 2$,
there exists a unique homomorphism $\psi : B \to A$ such that $\psi_n = \pi_n \circ \psi$ (cf.
[Kob80], [Se70]). Note that the uniqueness of $A$, once it exists, follows
by abstract nonsense. Hence for the ring $\mathbb{Z}_p$ it suffices to define the projections
$\pi_n : \mathbb{Z}_p \to \mathbb{Z}/p^n\mathbb{Z}$, and to check the universal property using digits as in
(4.3.2).
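Analogously,

$$\mathbb{Z}_p^\times = \varprojlim_n (\mathbb{Z}/p^n\mathbb{Z})^\times,$$

and one can describe the structure of the multiplicative group $\mathbb{Q}_p^\times$.
Put $\nu = 1$ for $p > 2$ and $\nu = 2$ for $p = 2$, and define

$$U = U_p = \{x \in \mathbb{Z}_p \mid x \equiv 1 \bmod p^\nu\}.$$

Then there is an isomorphism $U \cong \mathbb{Z}_p$ from the multiplicative group $U$ to the
additive group $\mathbb{Z}_p$, which is given by combining the natural homomorphism

$$U \to \varprojlim_n U/U^{p^n}$$

with the special isomorphisms

$$\alpha_{p^n} : U/U^{p^n} \cong \mathbb{Z}/p^n\mathbb{Z},$$

given by

$$\alpha_{p^n}((1 + p^\nu)^a) = a \bmod p^n \qquad (a \in \mathbb{Z}). \tag{4.3.4}$$

One easily verifies that (4.3.4) is well defined and gives the desired isomorphism.
Therefore, the group $U$ is a topological cyclic group, and $1 + p^\nu$ can be
taken as its generator. Another proof of this fact is obtained using the power
series

$$\log(1 + x) = \sum_{n=1}^\infty (-1)^{n+1} \frac{x^n}{n},$$

which defines an isomorphism from $U$ onto $p^\nu\mathbb{Z}_p$.
One has the following decompositions:

$$\mathbb{Q}_p^\times \cong p^{\mathbb{Z}} \times \mathbb{Z}_p^\times, \qquad \mathbb{Z}_p^\times \cong (\mathbb{Z}/p^\nu\mathbb{Z})^\times \times U. \tag{4.3.5}$$
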
4.3.2 Applications of p-adic Numbers to Solving Congruences


The first appearances of p-adic numbers, in papers by Hensel, were related to
the problem of finding solutions to congruences modulo $p^n$. An application of
this method by his student H. Hasse to the theory of quadratic forms has led
to an elegant reformulation of this theory, without the use of considerations
over the residue rings $\mathbb{Z}/p^n\mathbb{Z}$. These considerations are tedious because of the
zero-divisors in $\mathbb{Z}/p^n\mathbb{Z}$. From the above presentation of $\mathbb{Z}_p$ as the projective
limit

$$\varprojlim_n \mathbb{Z}/p^n\mathbb{Z}$$

it follows that for $f(x_1, \ldots, x_k) \in \mathbb{Z}_p[x_1, \ldots, x_k]$, the congruences

$$f(x_1, \ldots, x_k) \equiv 0 \pmod{p^n}$$

are solvable for all $n \ge 1$ iff the equation

$$f(x_1, \ldots, x_k) = 0$$

is solvable in p-adic integers. Solutions in $\mathbb{Z}_p$ can be obtained using the following
p-adic version of the “Newton-Raphson algorithm”.


Theorem 4.17 (Hensel's Lemma). Let $f(x) \in \mathbb{Z}_p[x]$ be a polynomial in
one variable $x$, $f'(x) \in \mathbb{Z}_p[x]$ its formal derivative, and suppose that for some
$\alpha_0 \in \mathbb{Z}_p$ the initial condition

$$|f(\alpha_0)/f'(\alpha_0)^2|_p < 1 \tag{4.3.6}$$

is satisfied.
Then there exists a unique $\alpha \in \mathbb{Z}_p$ such that

$$f(\alpha) = 0, \qquad |\alpha - \alpha_0|_p < 1.$$

We prove this by induction using the sequence of “successive approximations”:

$$\alpha_n = \alpha_{n-1} - \frac{f(\alpha_{n-1})}{f'(\alpha_{n-1})}.$$

Taking into account the formal Taylor expansion of $f(x)$ at $x = \alpha_{n-1}$ one
shows that this sequence is Cauchy, and its limit $\alpha$ has all the desired properties
(cf. [CF67], [BS85], [Se70]).

For example, if $f(x) = x^{p-1} - 1$, then any $\alpha_0 \in \{1, 2, \ldots, p-1\}$ satisfies
the condition $|f(\alpha_0)|_p < 1$. At the same time $f'(\alpha_0) = (p-1)\alpha_0^{p-2} \not\equiv 0 \bmod p$,
hence the initial condition (4.3.6) is satisfied. The root $\alpha$ coincides then with
the uniquely defined Teichmüller representative of $\alpha_0$: $\alpha = \omega(\alpha_0)$.

The method described is applicable to polynomials in many variables,
although for more than one variable the p-adic solution is not unique (cf.
[BS85], [Kob80], [Se70]).
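
The successive approximations in the proof of Hensel's Lemma can be sketched as follows. This is a minimal Python illustration restricted to the simplest case, where $f$ has integer coefficients, $f(\alpha_0) \equiv 0 \bmod p$ and $f'(\alpha_0) \not\equiv 0 \bmod p$, so that $f'$ stays invertible at every step. The function name `hensel_lift` and its interface are assumptions made for this example.

```python
def hensel_lift(f, df, a0, p, n):
    """Lift a simple root a0 of f modulo p to a root modulo p**n.

    f and df are Python callables computing an integer polynomial and its
    derivative. Each step is one Newton iteration a - f(a)/f'(a), carried
    out in Z/p**k Z with the division replaced by a modular inverse.
    """
    a = a0 % p
    if f(a) % p != 0 or df(a) % p == 0:
        raise ValueError("a0 must be a simple root of f modulo p")
    for k in range(2, n + 1):
        modulus = p ** k
        a = (a - f(a) * pow(df(a), -1, modulus)) % modulus
    return a
```

For instance, starting from $\alpha_0 = 3$ with $f(x) = x^2 - 2$ and $p = 7$ one obtains a square root of 2 modulo $7^n$, and with $f(x) = x^{p-1} - 1$ the lift of $\alpha_0$ agrees with the Teichmüller representative $\omega(\alpha_0)$ modulo $p^n$.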
Another interesting application of Hensel's Lemma is related to describing
the squares of the field $\mathbb{Q}_p$: for an arbitrary

$$\alpha = p^m \cdot v \in \mathbb{Q}_p \qquad (m \in \mathbb{Z},\ v \in \mathbb{Z}_p^\times),$$

the property that $\alpha$ is a square is equivalent to saying that

a) for $p > 2$: $m \in 2\mathbb{Z}$, and $\bar{v} = v \bmod p \in (\mathbb{Z}/p\mathbb{Z})^{\times 2}$ (i.e. $\left(\frac{\bar{v}}{p}\right) = 1$, where
$\left(\frac{\cdot}{p}\right)$ is the Legendre symbol (see §1.1.5));

b) for $p = 2$: $m \in 2\mathbb{Z}$ and $v \equiv 1 \bmod 8$.

The solvability of $x^2 = \alpha$ in $\mathbb{Q}_p$ under conditions a) and b) is implied by
Hensel's Lemma, and the necessity of these conditions is deduced more trivially
from considerations modulo $p$ and modulo 8.
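
Conditions a) and b) are easy to test for rational numbers. The following Python sketch does so; the names `split_unit` and `is_square_in_Qp` are ours, and the Legendre symbol is evaluated by Euler's criterion, which is a choice of this example rather than something prescribed by the text.

```python
from fractions import Fraction


def split_unit(x, p):
    """Write the nonzero rational x as p**m * v with v a p-adic unit."""
    x = Fraction(x)
    num, den, m = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        m += 1
    while den % p == 0:
        den //= p
        m -= 1
    return m, Fraction(num, den)


def is_square_in_Qp(x, p):
    """Decide whether the nonzero rational x is a square in Q_p."""
    if x == 0:
        raise ValueError("x must be nonzero")
    m, v = split_unit(x, p)
    if m % 2:
        return False
    modulus = 8 if p == 2 else p
    # residue of the unit v modulo 8 (p = 2) or modulo p (p odd)
    r = v.numerator * pow(v.denominator, -1, modulus) % modulus
    if p == 2:
        return r == 1
    return pow(r, (p - 1) // 2, p) == 1      # Euler's criterion
```

For example, 2 is a square in $\mathbb{Q}_7$ and $-7$ is a square in $\mathbb{Q}_2$, while $-1$ is a square in $\mathbb{Q}_5$ but not in $\mathbb{Q}_3$.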

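As a corollary we give the following description of the quotient group
$\mathbb{Q}_p^\times / \mathbb{Q}_p^{\times 2}$:

a) for $p > 2$ it is isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ with the system of coset
representatives $\{1, p, v, pv\}$, $\left(\frac{\bar{v}}{p}\right) = -1$;

b) for $p = 2$ it is isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ with the system of coset
representatives $\{\pm 1, \pm 5, \pm 2, \pm 10\}$.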


4.3.3 The Hilbert Symbol 


In this subsection we allow $p = \infty$, in which case we write $\mathbb{Q}_\infty$ for the field of
real numbers $\mathbb{R}$. The Hilbert symbol (or norm residue symbol)

$$(a, b) = \left(\frac{a, b}{p}\right) = (a, b)_p$$

is defined for $a, b \in \mathbb{Q}_p^\times$ by

$$(a, b) = \begin{cases} 1, & \text{if the form } ax^2 + by^2 - z^2 \text{ has a non-trivial zero in } \mathbb{Q}_p; \\ -1, & \text{otherwise.} \end{cases}$$

It is clear that $(a, b)$ depends only on $a$ and $b$ modulo squares. There is an
asymmetric form of the definition, namely $(a, b) = 1$ iff

$$a = z^2 - by^2 \text{ for some } y, z \in \mathbb{Q}_p. \tag{4.3.7}$$

Indeed, from (4.3.7) it follows that $(1, y, z)$ is a non-trivial zero of the
quadratic form $ax^2 + by^2 - z^2$. Conversely, if $(x_0, y_0, z_0)$ is a non-trivial zero,
then one can obtain all other zeros using a geometric trick in which one draws
secants from the point $(x_0, y_0, z_0)$ in all directions given by vectors with
coordinates in $\mathbb{Q}_p$ (see §1.2.3). Using this method we may reduce to the case
$x_0 \ne 0$. Then $(y_0/x_0, z_0/x_0)$ satisfies (4.3.7).


Local properties of the Hilbert symbol:

a) $(a, b) = (b, a)$; (4.3.8)
b) $(a_1 a_2, b) = (a_1, b)(a_2, b)$, $(a, b_1 b_2) = (a, b_1)(a, b_2)$; (4.3.9)
c) if $(a, b) = 1$ for all $b$, then $a \in \mathbb{Q}_p^{\times 2}$; (4.3.10)
d) $(a, -a) = 1$ for all $a$; (4.3.11)
e) if $p \ne 2, \infty$ and $|a|_p = |b|_p = 1$, then $(a, b) = 1$. (4.3.12)

In particular, for a fixed $b$, the $a$ for which $(a, b) = 1$ form a multiplicative
group. Equation (4.3.7) expresses the fact that $a$ is a norm from the quadratic
extension $\mathbb{Q}_p(\sqrt{b})/\mathbb{Q}_p$ (cf. [BS85], [Cas78], [Chev40], [Se70]).

A calculation of the Hilbert symbol makes it possible to solve completely
the “global” question on the existence of non-trivial rational zeros of quadratic
forms (in view of the Minkowski-Hasse theorem). If, say,

$$Q(x, y, z) = ax^2 + by^2 + cz^2 \qquad (a, b, c \in \mathbb{Q},\ c \ne 0), \tag{4.3.13}$$

then (4.3.13) has a non-trivial zero over $\mathbb{Q}$ iff $(-a/c, -b/c)_p = 1$ for all $p$
including $p = \infty$. This criterion is very effective because for almost all $p$ we
have $|a|_p = |b|_p = 1$, whence $(a, b)_p = 1$ for $p \ne 2, \infty$ in view of (4.3.12). We
give a table of the values of $(a, b)_p$:


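Table 4.1. The Hilbert symbol for $p > 2$. Here $v$ denotes an element $v \in \mathbb{Z}$ such
that $\left(\frac{v}{p}\right) = -1$, and $\varepsilon = 1$ iff $-1 \in \mathbb{Q}_p^{\times 2}$ (i.e. iff $p \equiv 1 \bmod 4$). Otherwise $\varepsilon = -1$.

  b \ a |   1     v     p     pv
  ------+-------------------------
    1   |  +1    +1    +1    +1
    v   |  +1    +1    -1    -1
    p   |  +1    -1    +ε    -ε
    pv  |  +1    -1    -ε    +ε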


A global property of the Hilbert symbol (the product formula). Let $a, b \in
\mathbb{Q}^\times$. Then $(a, b)_p = 1$ for almost all $p$ and

$$\prod_{p \text{ including } \infty} (a, b)_p = 1. \tag{4.3.14}$$

Formula (4.3.14) is equivalent to the quadratic reciprocity law (see Part I,
§1.1.5). Indeed, one has $|a|_p = |b|_p = 1$ for all but a finite number
of $p$, hence $(a, b)_p = 1$ for all but a finite number of $p$ in view of (4.3.12). Denote the left hand
side of (4.3.14) by $f(a, b)$; then by (4.3.9) one has

$$f(a_1 a_2, b) = f(a_1, b) f(a_2, b),$$
$$f(a, b_1 b_2) = f(a, b_1) f(a, b_2),$$

and one verifies that $f(a, b) = 1$ when $a$ and $b$ run through the set of generators
of the group $\mathbb{Q}^\times$: $-1$, $2$, and $-q$ with $q$ an odd prime.


Table 4.2. The Hilbert symbol for p = 2. 


  b \ a |   1     5    -1    -5     2    10    -2   -10
  ------+-------------------------------------------------
    1   |  +1    +1    +1    +1    +1    +1    +1    +1
    5   |  +1    +1    +1    +1    -1    -1    -1    -1
   -1   |  +1    +1    -1    -1    +1    +1    -1    -1
   -5   |  +1    +1    -1    -1    -1    -1    +1    +1
    2   |  +1    -1    +1    -1    +1    -1    +1    -1
   10   |  +1    -1    +1    -1    -1    +1    -1    +1
   -2   |  +1    -1    -1    +1    +1    -1    -1    +1
  -10   |  +1    -1    -1    +1    -1    +1    +1    -1
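
The tables and the product formula (4.3.14) can be checked by machine. The Python sketch below does not use the definition via quadratic forms; it relies instead on the standard explicit formulas for the Hilbert symbol found for instance in Serre's "A Course in Arithmetic" (Chapter III), which we assume here. It is restricted to nonzero integers $a, b$; the function names are ours, and `p=None` stands for the place $\infty$.

```python
def _split(x, p):
    """Write the nonzero integer x as p**alpha * u with u prime to p."""
    alpha = 0
    while x % p == 0:
        x //= p
        alpha += 1
    return alpha, x


def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p and u prime to p."""
    return 1 if pow(u % p, (p - 1) // 2, p) == 1 else -1


def hilbert(a, b, p=None):
    """Hilbert symbol (a, b)_p for nonzero integers; p=None means infinity."""
    if p is None:
        return -1 if (a < 0 and b < 0) else 1
    alpha, u = _split(a, p)
    beta, v = _split(b, p)
    if p == 2:
        eps = lambda t: ((t - 1) // 2) % 2          # (t-1)/2 mod 2, t odd
        omega = lambda t: ((t * t - 1) // 8) % 2    # (t^2-1)/8 mod 2, t odd
        e = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if e % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    if beta % 2:
        sign *= legendre(u, p)
    if alpha % 2:
        sign *= legendre(v, p)
    return sign
```

Multiplying `hilbert(a, b, p)` over the primes dividing $2ab$ together with `hilbert(a, b)` at infinity gives 1 in every case tried, as the product formula predicts.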

In what follows we shall need an analogous product formula for the
normalized absolute values $|\cdot|_p$.

The product formula for absolute values. Let $a \in \mathbb{Q}^\times$. Then $|a|_p = 1$ for all
but a finite number of $p$, and

$$\prod_{p \text{ including } \infty} |a|_p = 1. \tag{4.3.15}$$

Indeed, if $a \in \mathbb{Q}^\times$, then

$$a = \pm \prod_{p \ne \infty} p^{v_p(a)},$$

where $v_p(a) \in \mathbb{Z}$ and $v_p(a) = 0$ for all but a finite number of $p$. The product
formula now follows from the identities:

$$|a|_p = p^{-v_p(a)} \quad (\text{for } p \ne \infty),$$

$$|a|_\infty = \prod_{p \ne \infty} p^{v_p(a)}.$$

In §4.3.6 we discuss the global properties of absolute values in more detail.
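
The product formula (4.3.15) is easy to confirm on examples with exact rational arithmetic. The sketch below is only an illustration; the helper names are ours, and only the primes dividing the numerator or denominator are visited, since all other factors equal 1.

```python
from fractions import Fraction


def prime_divisors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, primes, d = abs(n), set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes


def padic_abs(x, p):
    """The p-adic absolute value of a nonzero rational x."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def product_of_absolute_values(a):
    """Product of |a|_p over all primes p and the place at infinity."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("a must be nonzero")
    result = abs(a)                                   # the factor |a|_infinity
    for p in prime_divisors(a.numerator) | prime_divisors(a.denominator):
        result *= padic_abs(a, p)
    return result
```

For $a = -50/21$ the factors are $50/21$ at infinity, $1/2$ at 2, $1/25$ at 5, 3 at 3 and 7 at 7, and their product is 1.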
4.3.4 Algebraic Extensions of $\mathbb{Q}_p$, and the Tate Field


If $K$ is a finite algebraic extension of $\mathbb{Q}_p$ then $K$ is generated over $\mathbb{Q}_p$ by some
primitive element $\alpha \in K$. The element $\alpha$ is a root of an irreducible polynomial
of degree $d = [K : \mathbb{Q}_p]$,

$$f(x) = x^d + a_{d-1} x^{d-1} + \cdots + a_0 \in \mathbb{Q}_p[x].$$

The absolute value $|\cdot|_p$ has a unique extension to $K$ defined by

$$|\beta|_p = \left(|N_{K/\mathbb{Q}_p}(\beta)|_p\right)^{1/d}, \tag{4.3.16}$$

where $N_{K/\mathbb{Q}_p}(\beta) \in \mathbb{Q}_p$ is the algebraic norm of the element $\beta \in K$. Formula
(4.3.16) defines a unique extension of $|\cdot|_p$ to the algebraic closure $\overline{\mathbb{Q}}_p$ of
$\mathbb{Q}_p$. The uniqueness of this extension can easily be deduced from the local
compactness of $K$ as a finite-dimensional $\mathbb{Q}_p$-vector space: all of its norms
over $\mathbb{Q}_p$ are equivalent (the same thing happens for $\mathbb{R}^n$). It then follows from
the multiplicativity of absolute values that any two must coincide.

The function $\mathrm{ord}_p$ can then also be extended to $\overline{\mathbb{Q}}_p$ by setting $\mathrm{ord}_p \alpha =
-\log_p |\alpha|_p$. Formula (4.3.16) implies that $\mathrm{ord}_p K^\times$ is an additive subgroup of
$\frac{1}{d}\mathbb{Z}$. Hence $\mathrm{ord}_p K^\times = \frac{1}{e}\mathbb{Z}$ for some positive integer $e$ dividing $d$. We shall call
$e$ the ramification index of the extension $K/\mathbb{Q}_p$.

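Put

$$\mathcal{O}_K = \{x \in K \mid |x|_p \le 1\}, \qquad \mathfrak{p}_K = \{x \in K \mid |x|_p < 1\}. \tag{4.3.17}$$

Then $\mathfrak{p}_K$ is the maximal ideal in $\mathcal{O}_K$ and the residue field $\mathcal{O}_K/\mathfrak{p}_K$ is a finite
extension of degree $f$ of $\mathbb{F}_p$. One has the relation $d = e \cdot f$, in which $f$ is
called the inertial degree of the extension. For each $x \in \mathcal{O}_K$ its Teichmüller
representative is defined by

$$\omega(x) = \lim_{n \to \infty} x^{p^{fn}}, \qquad \omega(x) \equiv x \pmod{\mathfrak{p}_K}, \tag{4.3.18}$$

and satisfies the equation

$$\omega(x)^{p^f} = \omega(x).$$

The map $\omega$ provides a homomorphism from the group of invertible elements

$$\mathcal{O}_K^\times = \mathcal{O}_K \setminus \mathfrak{p}_K = \{x \in K \mid |x|_p = 1\}$$

of $\mathcal{O}_K$ onto the group of roots of unity of degree $p^f - 1$ in $K$, denoted by
$\mu_{p^f - 1}$. One also has an isomorphism

$$(\mathcal{O}_K/\mathfrak{p}_K)^\times \cong \mu_{p^f - 1} \subset \mathcal{O}_K^\times. \tag{4.3.19}$$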


The structure of the multiplicative group $K^\times$ can be described analogously
to (4.3.5): if $[K : \mathbb{Q}_p] = d$, then

$$K^\times \cong \pi^{\mathbb{Z}} \times \mathcal{O}_K^\times, \qquad \mathcal{O}_K^\times \cong (\mathcal{O}_K/\mathfrak{p}_K)^\times \times U_K, \tag{4.3.20}$$

where $\pi$ is a generator of the principal ideal $\mathfrak{p}_K = \pi\mathcal{O}_K$ (i.e. any element
$\pi \in K^\times$ with $\mathrm{ord}_p \pi = 1/e$),

$$U_K = \{x \in \mathcal{O}_K^\times \mid |x - 1|_p < 1\} = D_1(1^-; K).$$

The structure of the group $U_K$ is then described as a direct product of $d$ copies
of the additive group $\mathbb{Z}_p$ and a finite group consisting of all p-power roots of
unity contained in $K$.


Example 4.18. If $e = 1$ then the extension $K$ is called unramified. In this case
$f = d$ and the Teichmüller representatives generate $K$ over $\mathbb{Q}_p$. Therefore

$$K = \mathbb{Q}_p(1^{1/N}), \qquad N = p^d - 1.$$

On the other hand, if $e = d$ then the extension $K$ is called totally ramified. For
example, if $\zeta$ is a primitive root of unity of degree $p^n$, then $\mathbb{Q}_p(\zeta)$ is totally
ramified of degree $d = p^n - p^{n-1}$, and we have that

$$\mathrm{ord}_p(\zeta - 1) = \frac{1}{p^n - p^{n-1}}. \tag{4.3.21}$$

The Tate Field. For purposes of analysis it is convenient to embed $\mathbb{Q}_p$ into
a bigger field, which is complete both in the topological and in the algebraic
sense. This field is constructed as the completion $\mathbb{C}_p$ of an algebraic
closure $\overline{\mathbb{Q}}_p$ of $\mathbb{Q}_p$ with respect to the unique absolute value satisfying the
condition $|p|_p = \frac{1}{p}$. The proof that $\mathbb{C}_p$ is algebraically closed is not difficult.
We shall use the notation

$$\mathcal{O}_p = \{x \in \mathbb{C}_p \mid |x|_p \le 1\}, \qquad \mathfrak{p} = \{x \in \mathbb{C}_p \mid |x|_p < 1\}.$$

Note that $\mathcal{O}_p$ and $\mathfrak{p}$ are no longer compact, so the field $\mathbb{C}_p$ is not locally
compact. We also have that $\mathcal{O}_p/\mathfrak{p} = \overline{\mathbb{F}}_p$ is an algebraic closure of $\mathbb{F}_p$.


4.3.5 Normalized Absolute Values 


If $F$ is a locally compact field, then its topology can be given by an absolute
value. This fact is deduced from the existence of a Haar measure $\mu$ on a
locally compact group $G$, i.e. a measure invariant under group shifts $x \mapsto gx$
($x, g \in G$):

$$\int_G f(x)\, d\mu(x) = \int_G f(x)\, d\mu(gx) = \int_G f(gx)\, d\mu(x)$$

for all integrable functions $f : G \to \mathbb{R}$. This measure is defined uniquely up to
a multiplicative constant. However, we do not need a general construction of
$d\mu$ (cf. [Wei40]), and we point out only some concrete examples.
If $G = \mathbb{R}$ (the additive group) then $d\mu(x) = dx$ (Lebesgue measure), and
$d(x + a) = dx$, $a \in \mathbb{R}$. If $G = \mathbb{R}^\times$ (the multiplicative group), then $d\mu = \frac{dx}{|x|}$.

If $G = \mathbb{C}$, $z = x + iy \in \mathbb{C}$, then $d\mu = dx\, dy$.

If $K/\mathbb{Q}_p$ is an extension of degree $d$, and $q = p^f$ is the number of elements
of the residue field $\mathcal{O}_K/\mathfrak{p}_K$, then the measure $d\mu$ on the additive group $K$
is uniquely determined by the number $\int_{\mathcal{O}_K} d\mu = \mu(\mathcal{O}_K) = c > 0$; one has
$\mu(a + \mathfrak{p}_K) = cq^{-1}$, because the measures of all of the sets $a + \mathfrak{p}_K$ are equal
and $\mathcal{O}_K = \bigcup_{a \bmod \mathfrak{p}_K} (a + \mathfrak{p}_K)$. More generally, for all $n \in \mathbb{Z}$ and $a \in K$ one has

$$\mu(a + \mathfrak{p}_K^n) = cq^{-n}. \tag{4.3.22}$$


Any measure $d\mu$ on the additive group of a locally compact field $F$ defines
an absolute value $\|\cdot\| : F \to \mathbb{R}_{\ge 0}$: for $\alpha \in F^\times$ the number $\|\alpha\|$ is defined as
the multiple by which the two Haar measures $d\mu(x)$ and $d\mu(\alpha x)$ on $F$ differ:

$$\mu(\alpha U) = \|\alpha\| \mu(U), \tag{4.3.23}$$

where $U$ is an open subset of positive measure, $\mu(U) = \int_U d\mu(x)$. The
multiplicativity property

$$\|\alpha\beta\| = \|\alpha\| \cdot \|\beta\| \qquad (\alpha, \beta \in F^\times) \tag{4.3.24}$$

follows immediately from definition (4.3.23). If the topology of $F$ is non-discrete,
i.e. not all subsets are open, then one verifies that discs of finite
radius $D_a(r) = \{x \in F \mid \|x - a\| \le r\}$ are compact, and the function $\|\cdot\|$ is
continuous. Hence this function is bounded on such discs. In particular,

$$\|1 + \alpha\| \le C \text{ for } \|\alpha\| \le 1 \tag{4.3.25}$$

for a positive constant $C \ge 1$. From (4.3.25) it follows that

$$\forall \alpha, \beta \in F: \quad \|\alpha + \beta\| \le C \max(\|\alpha\|, \|\beta\|), \tag{4.3.26}$$

which is weaker than the inequality in the definition of an absolute value from §4.2.
These functions are called generalized absolute values. If for example $F = \mathbb{C}$,
and $U = \{z = x + iy \in \mathbb{C} \mid |z| \le 1\}$, then $\mu(wU) = |w|^2 \mu(U)$, where $|w|^2 = w\bar{w}$,
and (4.3.26) is satisfied with $C = 4$. However, if for all $n \in \mathbb{N}$ one has $\|n\| \le 1$,
then $C = 1$, so that $\|\cdot\|$ is a non-Archimedean absolute value.

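In particular, for an extension $K/\mathbb{Q}_p$ with $[K : \mathbb{Q}_p] = d$ put

$$U = \mathcal{O}_K, \qquad \alpha = \pi^m v \quad (m \in \mathbb{Z},\ v \in \mathcal{O}_K^\times),$$

where $\pi$ is a uniformizing element, $\mathfrak{p}_K = (\pi)$. By (4.3.22) we have $\|\alpha\| = q^{-m} = p^{-fm}$.
Since $p = \pi^e u$ for some $u \in \mathcal{O}_K^\times$, this gives $\|p\| = q^{-e} = p^{-ef}$. On the other hand,

$$\|p\| = \mu(p\mathcal{O}_K)/\mu(\mathcal{O}_K) = |\mathcal{O}_K/p\mathcal{O}_K|^{-1} = p^{-d}.$$

This proves the formula $d = e \cdot f$.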
4.3.6 Places of Number Fields and the Product Formula


We shall call two (generalized) absolute values $\|\cdot\|_1$ and $\|\cdot\|_2$ of a field $F$
equivalent if $\|x\|_1 = \|x\|_2^c$ for all $x \in F$ and for a constant $c > 0$. A class of
equivalent absolute values is called a place of $F$, and it will be denoted by $v$.
The symbol $F_v$ denotes the corresponding completion (with respect to one of
the equivalent absolute values in $v$).

The theorem of Ostrowski (see §4.2) says that every place of $\mathbb{Q}$ is either
$v = p$ ($p$ a prime), or $v = \infty$. If the place $v$ is non-Archimedean, then we
let the same symbol $v$ denote the valuation of $F$ normalized by the condition
$v(F^\times) = \mathbb{Z}$.

We list places of finite extensions $F$ of $\mathbb{Q}$. To do this we construct all
possible extensions to $F$ of absolute values on $\mathbb{Q}$, since the restriction to $\mathbb{Q}$ of
any absolute value on $F$ is an absolute value of $\mathbb{Q}$. More generally, let $F/k$
be a finite separable extension of $k$ with an absolute value $|\cdot|_v$ (for example,
$k = \mathbb{Q}$ and $v = p$ or $v = \infty$); let $f(x) \in k[x]$ be the irreducible polynomial of degree
$n = [F : k]$ of a primitive element $\alpha$ for $F$ over $k$, and let

$$f(x) = \prod_{j=1}^m g_j(x) \qquad (g_j(x) \in L[x]) \tag{4.3.27}$$

be the decomposition of $f(x)$ into polynomials irreducible over $L$, where $L =
k_v$ is the completion of $k$ with respect to $v$.

In view of the theorem on tensor products of fields (see §4.1.3), there is a
ring isomorphism

$$F \otimes_k L \cong \prod_{j=1}^m L_j, \tag{4.3.28}$$

where $L_j = L[x]/(g_j(x))$ is the finite extension of $L$ containing $F$ via
$\lambda_j : F \to F \otimes_k L \to L_j$.

In §4.3.4 we saw that there exists a unique absolute value on $L_j$ extending
$|\cdot|_v$ from $L = k_v$, where it is canonically defined as on the completion. Let
us denote this extended absolute value on $L_j$ by the same symbol $|\cdot|_v$ and
define an absolute value $|\cdot|_{v,j}$ on $F$ using the embedding $\lambda_j$ by putting

$$|\beta|_{v,j} = |\lambda_j(\beta)|_v. \tag{4.3.29}$$


It is not difficult to verify that all the $|\cdot|_{v,j}$ are different, and that they are the
only extensions of $|\cdot|_v$ from $k$ to $F$, such that (4.3.28) becomes an isomorphism
of topological rings. Thus there are no more than $n = [F : k]$ extensions of
an absolute value $|\cdot|_v$ of $k$ to $F$. These extensions are described explicitly
by (4.3.29), assuming one knows the decomposition (4.3.27). Formula (4.3.16)
shows that

$$|\lambda_j(\beta)|_v = \left(|N_{L_j/L}(\lambda_j(\beta))|_v\right)^{1/n_j},$$

where $n_j = [L_j : L] = \deg g_j(x)$ is the local degree.
To obtain the normalized absolute value $\|\cdot\|_{v,j}$ we put for $\beta \in F^\times$

$$\|\beta\|_{v,j} = |N_{L_j/L}(\lambda_j(\beta))|_v.$$

Then for all $\beta \in F^\times$ one has:

$$\prod_{j=1}^m \|\beta\|_{v,j} = |N_{F/k}(\beta)|_v. \tag{4.3.30}$$

This follows from $N_{F/k}(\beta) = \prod_{j=1}^m N_{L_j/L}(\lambda_j(\beta))$ in view of §4.1.3.

Product Formula for Normalized Absolute Values. Let $k/\mathbb{Q}$ be a finite extension, $\alpha \in
k^\times$, and let $|\cdot|_v$ run through the normalized absolute values of $k$. Then $|\alpha|_v = 1$
for all but a finite number of $v$, and the following product formula holds:

$$\prod_v |\alpha|_v = 1. \tag{4.3.31}$$

This is easily deduced from formula (4.3.30), in which we put $k/\mathbb{Q}$ instead
of $F/k$ and notice that $N_{k/\mathbb{Q}}(\alpha) \in \mathbb{Q}^\times$. It then suffices to apply the already
proven product formula for $\mathbb{Q}$, see (4.3.15).


Global Fields. We use the term “global field” to refer to either a finite
extension of $\mathbb{Q}$ (an algebraic number field) or a finite, separable extension of
$\mathbb{F}_q(t)$, where $\mathbb{F}_q$ is the field with $q$ elements and $t$ is a (transcendental) variable
(a function field with positive characteristic) [AW45], [AT51], [Wei74a],
[CF67], [BoCa79].

In every global field there is a product formula and a similar classification
of the normalized absolute values. Many problems concerning integers have
natural analogies in function fields. These analogies can sometimes be more
successfully treated using methods of algebraic geometry, and they provide
a rich source of intuition for the number field case (see §4.5, §5.2, §6.5, and
Introductory survey to Part III).


4.3.7 Adeles and Ideles 
The Ring of Adeles. 


In arithmetical questions the ring $\mathbb{Z}$ is often considered as a lattice in $\mathbb{R}$,
i.e. a discrete subgroup of the additive group of the locally compact field $\mathbb{R}$
with compact quotient group $\mathbb{R}/\mathbb{Z}$, the quotient being isomorphic to a circle. It
turns out that for a global field $k$ one can canonically construct the “smallest”
locally compact ring $\mathbb{A}_k$ containing $k$ as a lattice. This means that $k$ is a
discrete subring in $\mathbb{A}_k$ with compact additive quotient group $\mathbb{A}_k/k$. The ring
$\mathbb{A}_k$, which is called the ring of adeles, is constructed using all the embeddings
$k \hookrightarrow k_v$, where $v$ runs through the set $\Sigma = \Sigma_k$ of all places of $k$. One defines
$\mathbb{A}_k$ to be the subring of the product $\prod_{v \in \Sigma} k_v$ consisting of all infinite vectors
$\alpha = (\alpha_v)_{v \in \Sigma}$, $\alpha_v \in k_v$, such that $\alpha_v \in \mathcal{O}_v$ for all but a finite number of $v$ (in
view of §4.3.6 the number of Archimedean places does not exceed $n = [k : \mathbb{Q}]$;
hence all but a finite number of places are non-Archimedean, and the compact
subring $\mathcal{O}_v \subset k_v$ is defined to be the valuation ring of $v$):

$$\mathbb{A}_k = \Big\{ \alpha = (\alpha_v) \in \prod_{v \in \Sigma} k_v \ \Big|\ \alpha_v \in \mathcal{O}_v \text{ for all but a finite number of } v \Big\}. \tag{4.3.32}$$

One gives $\mathbb{A}_k$ the topology generated by the open subsets of the type

$$W_S = \prod_{v \in S} W_v \times \prod_{v \notin S} \mathcal{O}_v, \tag{4.3.33}$$

where $S$ runs through all finite subsets $S \subset \Sigma$, and $W_v$ are open subsets in $k_v$.
The set $W_S$ has compact closure if all the $W_v$ are bounded. Hence
$\mathbb{A}_k$ is a locally compact topological ring in which $k$ is embedded diagonally:

$$k \ni \alpha \mapsto (\cdots, \alpha, \alpha, \cdots)_{v \in \Sigma} \in \mathbb{A}_k \subset \prod_{v \in \Sigma} k_v$$

(note that in view of §4.3.6 $|\alpha|_v = 1$ for all but a finite number of $v \in \Sigma$). It is
interesting to note that the product $\prod_{v \in \Sigma} k_v$ is too big to be locally compact:
by definition of the product topology, the projection of any open subset $U \subset
\prod_{v \in \Sigma} k_v$ onto $k_v$ coincides with $k_v$ for almost all $v$, thus $U$ would never be
compact, having non-compact image under a continuous map (projection).
The above construction of $\mathbb{A}_k$ is called the restricted topological product of the
topological spaces $k_v$ with respect to the compact subspaces $\mathcal{O}_v$ defined for
all but a finite number of indices $v$. The convergence of a sequence $\{\alpha_n\}_{n=1}^\infty$,
$\alpha_n = (\alpha_{v,n})_v \in \mathbb{A}_k$, to $\beta = (\beta_v) \in \mathbb{A}_k$ means that for any $\varepsilon > 0$ and any finite
set $S \subset \Sigma$ there exists $N \in \mathbb{N}$ such that

1) $\forall n > N\ \forall v \notin S$: $\alpha_{v,n} - \beta_v \in \mathcal{O}_v$,
2) $\forall n > N\ \forall v \in S$: $|\alpha_{v,n} - \beta_v|_v < \varepsilon$.


Every principal adele $\alpha$, i.e.

$$\alpha = (\cdots, \alpha, \alpha, \cdots)_v \in k \subset \mathbb{A}_k, \qquad (4.3.34)$$

can be separated from the rest of $k$ by a neighborhood of type (4.3.33) with
$S = \{v \mid \alpha \notin \mathcal{O}_v\}$. Hence $k$ is discrete in $\mathbb{A}_k$. The compactness of the
quotient group $\mathbb{A}_k/k$ has an explanation via the Pontryagin duality theory of
locally compact commutative topological groups: $\mathbb{A}_k/k$ is isomorphic to the
group $\hat{k}$ of all characters of $k$. Recall that for a locally compact commutative
group $G$ its group of continuous characters

$$\hat{G} = \mathrm{Hom}_{\mathrm{contin}}(G, S^1) \qquad (4.3.35)$$


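(where $S^1 = \{z \in \mathbb{C}^\times \mid |z| = 1\}$) is again a locally compact group in the
natural topology of the character group; one always has $\hat{\hat{G}} \cong G$, and for
any exact sequence

$$1 \to G_1 \to G \to G_2 \to 1$$

with continuous homomorphisms, the dual sequence for characters is exact:

$$1 \to \hat{G}_2 \to \hat{G} \to \hat{G}_1 \to 1.$$

By the association $G \mapsto \hat{G}$, finite groups remain finite; discrete groups become
compact groups (and conversely), and for a connected group $G$ its dual $\hat{G}$ is
torsion free. If $H \subset G$ is a closed subgroup, then its annihilator

$$H^\perp = \{\chi \in \hat{G} \mid \chi(H) = 1\} \qquad (4.3.36)$$

is isomorphic to $(G/H)^\wedge$.

In the simplest example $\mathbb{Z} \subset \mathbb{R}$ one has $\hat{\mathbb{Z}} \cong S^1$, $\widehat{S^1} \cong \mathbb{Z}$, and the
group $\mathbb{R}$ is self-dual: $\hat{\mathbb{R}} \cong \mathbb{R}$ (the number $t \in \mathbb{R}$ corresponds to the character
$x \mapsto e^{2\pi i x t}$), so that $\mathbb{Z}^\perp \cong \mathbb{Z}$.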

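One can verify that the additive group $\mathbb{A}_k$ is self-dual, and $\alpha \in \mathbb{A}_k$ corresponds
to the character $(\beta \mapsto \chi(\alpha\beta)) \in \hat{\mathbb{A}}_k$, where $\chi$ is a non-trivial additive
character of $\mathbb{A}_k$ satisfying $\chi(k) = 1$, so that $k \cong k^\perp = (\mathbb{A}_k/k)^\wedge$.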

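Consider in detail the case $k = \mathbb{Q}$, and the ring $\mathbb{A} = \mathbb{A}_{\mathbb{Q}}$. For $\alpha = (\alpha_v)_v \in \mathbb{A}$
the fractional parts $\{\alpha_v\}$ are defined (for $v = p$ one uses the $p$-adic expansion
(4.3.2) to define $\{\alpha_p\} = a_{-1}p^{-1} + \cdots + a_m p^m$ for $m < 0$). Then for all but a
finite number of $v$ we have that $\{\alpha_v\} = 0$, and $\{\alpha\} = \sum_{v\ne\infty} \{\alpha_v\}$ is a rational
number. The character $\chi$ can be defined by the formula


$$\beta \mapsto \exp(-2\pi i\{\beta_\infty\}) \cdot \prod_{v\ne\infty} \exp(2\pi i\{\beta_v\}), \qquad (4.3.37)$$


and for each $\beta \in \mathbb{Q}$ one has $\chi(\beta) = 1$.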
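The triviality of $\chi$ on principal adeles can be checked by hand for rational numbers. The following is only a sketch under stated assumptions: $\beta$ is a rational number embedded diagonally, the $p$-adic fractional part is taken to be the unique rational $a/p^k \in [0, 1)$ with $\beta - a/p^k \in \mathbb{Z}_p$, and all function names are illustrative rather than taken from the text. The exponent of (4.3.37) is then $\sum_p \{\beta\}_p - \{\beta_\infty\}$, and $\chi(\beta) = 1$ means that this number is an integer.

```python
from fractions import Fraction


def prime_factors(n):
    """Set of prime divisors of a positive integer n (trial division)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def padic_fractional_part(beta, p):
    """{beta}_p: the rational a/p^k in [0, 1) with beta - a/p^k in Z_p."""
    num, den = beta.numerator, beta.denominator
    pk = 1
    while den % p == 0:
        den //= p
        pk *= p
    if pk == 1:
        return Fraction(0)
    # beta = num / (pk * den) with den a p-adic unit, so invert den modulo pk.
    a = (num * pow(den, -1, pk)) % pk
    return Fraction(a, pk)


def real_fractional_part(beta):
    """{beta_infinity}: beta minus its floor."""
    return beta - (beta.numerator // beta.denominator)


def character_exponent(beta):
    """sum_p {beta}_p - {beta_infinity}; chi(beta) = exp(2*pi*i * this value)."""
    finite = sum(padic_fractional_part(beta, p)
                 for p in prime_factors(beta.denominator))
    return finite - real_fractional_part(beta)


for beta in (Fraction(5, 12), Fraction(-7, 10), Fraction(22, 7)):
    # An integer exponent means chi(beta) = 1.
    print(beta, character_exponent(beta))
```

For $\beta = 5/12$ the fractional parts are $3/4$ at $p = 2$ and $2/3$ at $p = 3$, while $\{\beta_\infty\} = 5/12$, so the exponent is $17/12 - 5/12 = 1$.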
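For each component $v$ the character $\chi_v : \mathbb{Q}_v \to S^1$ is defined by


$$\chi_v(\beta) = \exp(2\pi i\{\beta\})$$


($\beta \in \mathbb{Q}_v$), which provides the self-duality of the locally compact field $\mathbb{Q}_v$
($v = p, \infty$) in a similar way: an element $t \in \mathbb{Q}_v$ corresponds to the character
$x \mapsto \chi_v(tx)$. This also gives us a description of the quotient group


$$\mathbb{A}/\mathbb{Q} \cong \mathbb{R}/\mathbb{Z} \times \prod_p \mathbb{Z}_p, \qquad (4.3.38)$$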


which is easily seen by subtracting from an adele $\alpha$ its fractional part

$$\{\alpha\} = \sum_{v\ne\infty} \{\alpha_v\} \in \mathbb{Q}. \qquad (4.3.39)$$


The quotient group $\mathbb{A}/\mathbb{Q}$ is compact by the theorem of A.N.Tychonov on
products of compact spaces. For a number field $k$ it is useful to consider the
isomorphism of topological rings


$$\mathbb{A}_k \cong k \otimes \mathbb{A}_{\mathbb{Q}}, \qquad (4.3.40)$$


which implies an isomorphism of additive groups $\mathbb{A}_k \cong \mathbb{A}_{\mathbb{Q}}^n$, where
$n = [k : \mathbb{Q}]$, and also the statements on the discreteness of $k$ in $\mathbb{A}_k$ and on the
compactness of the quotient group $\mathbb{A}_k/k$. One verifies easily that an analogous
isomorphism takes place for an arbitrary extension of global fields $F/k$:


$$\mathbb{A}_F \cong F \otimes_k \mathbb{A}_k. \qquad (4.3.41)$$


The Idele Group 


(cf. [Chev40], [Wei74a]). The set of all invertible elements of a ring $R$ forms a
multiplicative group $R^\times$. If $R$ is topological, the topology on $R^\times$ is defined by
means of the embedding $x \mapsto (x, x^{-1})$ ($R^\times \to R \times R$) so that the inversion map
$x \mapsto x^{-1}$ is continuous. The idele group $J_k$ of a global field $k$ is the topological
group $\mathbb{A}_k^\times$ of invertible elements of the ring $\mathbb{A}_k$. The group $J_k$ coincides with
the restricted topological product of the locally compact groups $k_v^\times$ with respect
to the compact subgroups $\mathcal{O}_v^\times$ defined for non-Archimedean places $v \in \Sigma$.


4.3.8 The Geometry of Adeles and Ideles 


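The embedding of $k$ into its ring of adeles $\mathbb{A}_k$ is reminiscent of the geometric
interpretation of the ring of integers $\mathcal{O} = \mathcal{O}_k$ of $k$ as a lattice in the $\mathbb{R}$-algebra


$$k_\infty = k \otimes \mathbb{R} \cong \prod_{v\mid\infty} k_v \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}, \qquad \mathcal{O} \subset k_\infty. \qquad (4.3.42)$$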


This analogy goes much further. Consider a Haar measure $\mu$ on the locally
compact additive group $\mathbb{A}_k$; this measure can be defined on the open subsets
$W_S$ of type (4.3.33) by


$$\mu(W_S) = \prod_{v\in S} \mu_v(W_v), \qquad (4.3.43)$$


where $\mu_v(\mathcal{O}_v) = 1$ for $v \nmid \infty$ (i.e. for non-Archimedean $v$); for Archimedean
places one normalizes the measure as follows:


$$d\mu_v = \begin{cases} dx \ (\text{Lebesgue measure}) & \text{if } k_v \cong \mathbb{R}, \\ 2\,dx\,dy = |dz \wedge d\bar{z}| & \text{if } z = x + iy \in k_v \cong \mathbb{C}. \end{cases}$$
If $\beta = (\beta_v) \in J_k$ is an idele, then its module is defined to be the
multiplicative constant $|\beta|$ by which the Haar measures $\mu(x)$ and $\mu(\beta x)$ on $\mathbb{A}_k$
differ:


$$\mu(\beta x) = |\beta| \cdot \mu(x). \qquad (4.3.44)$$


It follows from the description (4.3.43) of $\mu$ that $|\beta| = \prod_v |\beta_v|_v$, where $|\cdot|_v$
is the normalized absolute value from the class of a place $v \in \Sigma$, which for
Archimedean places is given by the following:


$$|x|_v = |x| \ (\text{the usual absolute value}) \text{ if } k_v \cong \mathbb{R}, \qquad |z|_v = z\bar{z} = x^2 + y^2 \text{ if } z = x + iy \in k_v \cong \mathbb{C}.$$


On the compact quotient group $\mathbb{A}_k/k$ we define a measure $\mu$ by means
of a general notion of fundamental domain: if $\Gamma$ is a discrete subgroup of a
locally compact group $G$, then a fundamental domain $X$ for $G$ modulo $\Gamma$ is
a complete set of coset representatives for (left) cosets $G/\Gamma$, which has some
additional measurability properties. By restricting the Haar measure $\alpha$ of $G$
onto the subset $X$, one obtains a uniquely defined measure on $G/\Gamma$, which is
denoted by the same letter, and $\alpha(G/\Gamma) = \alpha(X)$.

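In order to construct a fundamental domain $X$ for $\mathbb{A}_k/k$ we choose a $\mathbb{Z}$-basis
$\omega_1, \cdots, \omega_n$ of the free Abelian group $\mathcal{O} \subset k$ of algebraic integers in $k$.
This is also a basis of the vector space $k_\infty = k \otimes \mathbb{R}$ over $\mathbb{R}$, and it defines an
isomorphism $\theta : \mathbb{R}^n \to k_\infty$ by the formula


$$\theta((u_1, \cdots, u_n)) = \sum_{i=1}^{n} u_i\omega_i.$$


Denote by $I$ the interval $0 \le t < 1$ in $\mathbb{R}$. Then $\theta(I^n)$ is a fundamental
parallelogram for the lattice $\mathcal{O}$ in $k_\infty$ (see §4.1.3). Now take $X$ to be the set


$$X = \theta(I^n) \times \prod_{v\nmid\infty} \mathcal{O}_v \qquad (4.3.45)$$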


(a fundamental domain for $k$ in $\mathbb{A}_k$). To prove that $X$ is a fundamental domain,
we note that $k_\infty + k$ is dense in $\mathbb{A}_k$. This statement is known as the
approximation theorem and it is a version of the Chinese remainder theorem
(cf. §1.1.5). Moreover, $k_\infty \times \prod_{v\nmid\infty} \mathcal{O}_v$ is an open subgroup in $\mathbb{A}_k$, hence for any
$x \in \mathbb{A}_k$ there exists $\eta \in k$ such that


$$x - \eta \in k_\infty \times \prod_{v\nmid\infty} \mathcal{O}_v.$$


The condition that another element $\eta' \in k$ has the same property is equivalent
to saying that $\eta - \eta' \in \mathcal{O}_v$ for all non-Archimedean places $v$, i.e. that
$\eta - \eta' \in \mathcal{O}_k$. Thus by an appropriate choice of $\eta$ we may assume that the
$k_\infty$-coordinate $y_\infty$ of $y = x - \eta$ belongs to $\theta(I^n)$; therefore $y_\infty = \theta(u)$, $u \in I^n$,
where $u$ is uniquely determined. This establishes the statement.

The first application of the measure constructed on $\mathbb{A}_k/k$ is a simple proof
of the product formula (4.3.31): if $\beta \in k^\times \subset J_k$ is a principal idele, then
$\beta k^\times = k^\times$ in $J_k$, and multiplication by $\beta$ defines a homeomorphism of $\mathbb{A}_k/k$
with itself, hence the Haar measures $\mu(x)$ and $\mu(\beta x)$ on $\mathbb{A}_k/k$ must coincide,
i.e. by (4.3.44) we see that


$$|\beta| = \prod_v |\beta_v|_v = \frac{\mu(\beta\,\mathbb{A}_k/k)}{\mu(\mathbb{A}_k/k)} = 1.$$
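For $k = \mathbb{Q}$ the product formula can be verified directly with exact arithmetic. The sketch below is an illustration only: it assumes the usual normalization $|\beta|_p = p^{-v_p(\beta)}$ and $|\beta|_\infty = |\beta|$, and the helper names are hypothetical. Only the primes dividing the numerator or denominator of $\beta$ contribute a factor different from 1.

```python
from fractions import Fraction


def prime_factors(n):
    """Set of prime divisors of a positive integer n (trial division)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def abs_p(beta, p):
    """Normalized p-adic absolute value p^(-v_p(beta)) of a nonzero rational."""
    v, num, den = 0, abs(beta.numerator), beta.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def idele_module(beta):
    """Product of |beta|_v over all places v of Q for a nonzero rational beta."""
    primes = prime_factors(abs(beta.numerator)) | prime_factors(beta.denominator)
    result = abs(beta)  # Archimedean factor
    for p in primes:
        result *= abs_p(beta, p)
    return result


print(idele_module(Fraction(-360, 77)))
```

Here $-360/77 = -2^3 \cdot 3^2 \cdot 5 \cdot 7^{-1} \cdot 11^{-1}$, so the finite factors multiply to $77/360$ and cancel the Archimedean factor $360/77$.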


Let us calculate the measure $\mu(\mathbb{A}_k/k)$. The form of the fundamental domain
$X$ constructed above reduces this calculation to the problem of determining
the volume of the fundamental parallelogram $\theta(I^n)$ in $k_\infty$. This volume was
already found in §4.1.3, (4.1.7). We obtain


$$\mu(\mathbb{A}_k/k) = |D_k|^{1/2}, \qquad (4.3.46)$$


where $D_k = \det(\mathrm{Tr}(\omega_i\omega_j))$ is the discriminant of $k$. Here we have taken into
account that the measure $\prod_{v\mid\infty} \mu_v$ on $k_\infty$ differs by a factor of 2 from the
Lebesgue measure on each of those components $v$ such that $k_v \cong \mathbb{C}$, where


$$d\mu_v(z) = 2\,dx\,dy = |dz \wedge d\bar{z}| \quad \text{for } z = x + iy \in k_v \cong \mathbb{C}.$$


Consider the constant

$$C = \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_k|}. \qquad (4.3.47)$$

This number is important for finding non-zero points $\beta$ in the lattice $k \subset \mathbb{A}_k$
belonging to a parallelotope, i.e. to a set of the form

$$V(c) = \{x = (x_v)_v \in \mathbb{A}_k \mid \forall v \in \Sigma_k\ |x_v|_v \le c_v\}, \qquad (4.3.48)$$


where $c = (c_v)$ is an infinite tuple of positive constants defined for the
places $v$ of $k$, all but a finite number of which are 1.


Lemma 4.19 (Blichfeldt). Assume that for the numbers $c_v$ we have that

$$\prod_v c_v > \left(\frac{2}{\pi}\right)^{r_2} \sqrt{|D_k|}.$$


Then there exists $\beta \in k^\times \cap V(c) \subset \mathbb{A}_k$.

Proof. Consider the auxiliary parallelotope $V(c')$, $c' = (c'_v)_v$, $v \in \Sigma$, where

$$c'_v = \begin{cases} c_v & \text{if } v \text{ is non-Archimedean}, \\ c_v/2 & \text{if } k_v \cong \mathbb{R}, \\ c_v/4 & \text{if } z = x + iy \in k_v \cong \mathbb{C}. \end{cases}$$


Then one can calculate the measure of $V(c')$:


$$\mu(V(c')) = \left(\frac{\pi}{2}\right)^{r_2} \prod_v c_v > \sqrt{|D_k|}.$$


In other words, the measure of $V(c')$ is bigger than that of the fundamental
domain for $\mathbb{A}_k/k$, hence there exist two distinct points $y$ and $y' \in V(c')$ whose
images modulo $k$ coincide, i.e. $y - y' \in k^\times$. We obtain for the number $\beta = y - y'$
the following estimates:


$$|\beta|_v \le \begin{cases} \max(|y_v|_v, |y'_v|_v) \le c_v & \text{if } v \text{ is non-Archimedean}, \\ 2\max(|y_v|_v, |y'_v|_v) \le c_v & \text{if } k_v \cong \mathbb{R}, \\ 4\max(|y_v|_v, |y'_v|_v) \le c_v & \text{if } z = x + iy \in k_v \cong \mathbb{C}, \end{cases}$$


proving the lemma.


We now turn our attention to the structure of the idele group. Consider the
homomorphism $|\cdot| : J_k \to \mathbb{R}_+^\times$, which takes $y = (y_v)_v \in J_k$ to $|y| = \prod_v |y_v|_v$.
Denote by $J_k^1$ its kernel; then $J_k^1$ is a closed subgroup, and in view of the
product formula (4.3.31) we have that $k^\times \subset J_k^1$. The following theorem is one
of the most important facts in algebraic number theory.


Theorem 4.20. The quotient group $J_k^1/k^\times$ is compact.


The proof relies on Blichfeldt’s lemma, and is very similar to the proof of 
Dirichlet’s unit theorem, and the deduction of the latter from Minkowski’s 
lemma. One can show that this theorem is equivalent to the conjunction of 
Dirichlet’s unit theorem and the finiteness of the ideal class group (see §4.1.6 
and §4.2.2). These two statements can be easily deduced from the above the- 
orem as follows: 


The Divisor Map. Let $I_k$ be the group of fractional ideals (divisors), i.e.
the free Abelian group generated by the set of non-Archimedean places of $k$.
Define


$$\mathrm{div} : J_k \to I_k, \quad \mathrm{div}((x_v)) = \sum_{v\nmid\infty} v(x_v) \cdot v, \qquad (4.3.49)$$


where $v$ denotes, as agreed above, the valuation of $k$ normalized by the condition
$v(k^\times) = \mathbb{Z}$. Note that $\mathrm{div}(J_k^1) = I_k$ and that changing only the Archimedean
component $x_\infty = (x_v)_{v\mid\infty}$ of an idele $x$ does not change $\mathrm{div}(x)$. Note also
that $\mathrm{div}(k^\times) = P_k$ is the subgroup of principal ideals in the discrete group
$I_k$. Hence we have a continuous epimorphism $\mathrm{div} : J_k^1/k^\times \to I_k/P_k = \mathrm{Cl}_k$
of a compact group onto a discrete group. The image is both compact and
discrete, and is therefore finite.


The Logarithmic Map and S-Units. Let $S \subset \Sigma$ be a finite set of places
containing the set $\Sigma_\infty$ of all Archimedean places. The set of elements $\eta \in k^\times$
satisfying $|\eta|_v = 1$ for all $v \notin S$ forms a multiplicative group, which is denoted
by $E_S$ and is called the group of $S$-units.

Theorem 4.21 (Theorem on S-Units). The group $E_S$ is the direct sum of
a finite cyclic group and a free Abelian group of rank $s - 1$, where $s = \mathrm{Card}\, S$
is the number of places in $S$.


(cf. [La70]). The proof of this theorem is similar to that of Dirichlet's unit
theorem (see §4.1.4). One considers the logarithmic map


$$l : J_k \to \underbrace{\mathbb{R} \oplus \cdots \oplus \mathbb{R}}_{s \text{ times}} \qquad (4.3.50)$$


(where $\mathbb{R}$ is the additive group of real numbers), defined by


$$l((x_v)_v) = (\cdots, \log |x_v|_v, \cdots)_{v\in S}.$$


This map is continuous, and its image contains a basis of the vector space $\mathbb{R}^s$
(if $S = \Sigma_\infty$ then $l$ is an epimorphism).
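A concrete instance of Theorem 4.21 may help. The following sketch assumes $k = \mathbb{Q}$ and $S = \{\infty, 2, 3\}$, a choice made here purely for illustration, so that $E_S = \{\pm 2^a 3^b\}$ and $s - 1 = 2$; the function names are hypothetical. It evaluates the logarithmic map on $S$-units and checks two things: the coordinates of $l(\eta)$ sum to zero (the product formula), and the images of the generators 2 and 3 are linearly independent.

```python
import math
from fractions import Fraction

S_FINITE = (2, 3)  # S = {infinity, 2, 3}, hence s = 3


def valuation(eta, p):
    """p-adic valuation of a nonzero rational eta."""
    v, num, den = 0, abs(eta.numerator), eta.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def is_s_unit(eta):
    """True if |eta|_p = 1 for every prime p outside S."""
    num, den = abs(eta.numerator), eta.denominator
    for p in S_FINITE:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1


def log_map(eta):
    """l(eta) = (log|eta|_v) for v in S, ordered as (infinity, 2, 3)."""
    coords = [math.log(abs(float(eta)))]
    coords += [-valuation(eta, p) * math.log(p) for p in S_FINITE]
    return coords


eta = Fraction(-8, 9)
print(is_s_unit(eta), log_map(eta), sum(log_map(eta)))
```

The image of $E_S$ therefore lies in the hyperplane $\sum_{v\in S} t_v = 0$ of $\mathbb{R}^3$ and spans it, which matches the rank $s - 1 = 2$; the finite cyclic part is $\{\pm 1\}$.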

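With the help of (4.3.49) and (4.3.50) it is not difficult to describe
fundamental domains for $k^\times$ in $J_k$ and $J_k^1$ (cf. [Wei74a], pp. 137-139). One can
calculate the volume $\bar{\gamma}(J_k^1/k^\times)$ with respect to the Haar measure $\bar{\gamma}$ on $J_k^1/k^\times$.
We normalize the measure $\bar{\gamma}$ by using the decomposition:


$$J_k/k^\times \cong J_k^1/k^\times \times \mathbb{R}_+^\times, \quad \gamma = \bar{\gamma} \times \frac{dt}{t}, \qquad (4.3.51)$$


in which $\gamma = \prod_v \gamma_v$ is the Haar measure on $J_k$, normalized as follows:


$$\gamma_v(\mathcal{O}_v^\times) = 1 \text{ if } v \text{ is non-Archimedean}, \quad d\gamma_v(x) = |x|^{-1}dx \text{ if } k_v \cong \mathbb{R}, \quad d\gamma_v(z) = |z\bar{z}|^{-1}|dz \wedge d\bar{z}| \text{ if } z = x + iy \in k_v \cong \mathbb{C}$$

(here $|dz \wedge d\bar{z}| = 2\,dx\,dy$).


Then the following formula holds:


$$\bar{\gamma}(J_k^1/k^\times) = \frac{2^{r_1}(2\pi)^{r_2} h R}{w} = \kappa_k, \qquad (4.3.52)$$

where $h = |\mathrm{Cl}_k|$ is the class number of $k$, $R = R_k$ is the regulator, and $w = w_k$ is
the number of roots of unity in $k$, see 4.1.3. This formula means that for any
positive number $m > 1$ in $\mathbb{R}$ the subset $C(m)$ of $J_k/k^\times$ defined by $C(m) =
\{x \in J_k/k^\times \mid 1 \le |x| \le m\}$ has measure


$$\gamma(C(m)) = \kappa_k \log m. \qquad (4.3.53)$$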


The quantities $R = R_k$, $h = h_k$, and $D = D_k$ turn out to be the most
important constants characterizing a number field $k$. These quantities occur
together in formulae (4.3.52) and (4.3.53) for the volumes of fundamental
domains, and are not independent. According to a deep result of Brauer and
Siegel (cf. [La70], [La83]), one knows that for a sequence of number fields
$k_m$ of degrees $n_m = [k_m : \mathbb{Q}]$ satisfying the condition $n_m/\log|D_{k_m}| \to 0$ as
$m \to \infty$, the following asymptotic relation holds


$$\log(h_{k_m} \cdot R_{k_m}) \sim \log(|D_{k_m}|^{1/2}). \qquad (4.3.54)$$


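The idele class group

$$C_k = J_k/k^\times$$

plays a key role in classifying the Abelian extensions of $k$ (class field theory),
cf. §4.4.
If $k = \mathbb{Q}$ then there are isomorphisms


$$J_{\mathbb{Q}}/\mathbb{Q}^\times \cong \mathbb{R}_+^\times \times \prod_p \mathbb{Z}_p^\times, \qquad (4.3.55)$$


$$J_{\mathbb{Q}}^1/\mathbb{Q}^\times \cong \prod_p \mathbb{Z}_p^\times, \qquad (4.3.56)$$


which are easily established by dividing an idele $\alpha \in J_{\mathbb{Q}}$ by its (multiplicative)
divisor $\mathrm{div}(\alpha) = \prod_p p^{v_p(\alpha_p)}$, which in this situation turns out to be a positive
rational number. As a result one obtains the element $\alpha \cdot \mathrm{sign}(\alpha_\infty) \cdot \mathrm{div}(\alpha)^{-1}$,
which belongs to the right hand side of (4.3.55).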


4.4 Class Field Theory 
4.4.1 Abelian Extensions of the Field of Rational Numbers 


(cf. [AT51], [Chev40], [Wei74a]). One of the central objects of algebraic number
theory is the full Galois group $G = G(\bar{\mathbb{Q}}/\mathbb{Q})$ of $\bar{\mathbb{Q}}$ over $\mathbb{Q}$, together with its
subgroups $H \subset G$ of finite index, which correspond to finite extensions $k$ of
$\mathbb{Q}$:

$$H = G_k = G(\bar{\mathbb{Q}}/k) \subset G.$$


From the topological point of view $G$ is a compact, totally disconnected group,
with the topology of a profinite group (the projective limit of its finite quotient
groups):

$$G = \varprojlim_k G/G_k = \varprojlim_k G(k/\mathbb{Q}),$$


where $G_k$ are normal subgroups which are both closed and open, as they
correspond to finite Galois extensions $k/\mathbb{Q}$.

Class field theory provides a purely arithmetical description of the maximal
Abelian (Hausdorff) quotient group $G_k^{ab} = G_k/G_k^c$, where $G_k^c$ is the closure
of the commutator subgroup of $G_k$. Moreover, one has this description both
for algebraic number fields and for function fields (global fields of positive
characteristic). One form of this description of $G_k^{ab}$ is given by a calculation
of all characters (one-dimensional complex representations) of the full Galois
group $G_k$.

The topological structure of infinite Galois groups is similar to that of
locally compact analytic Lie groups over $p$-adic fields such as $\mathrm{SL}_n(\mathbb{Q}_p)$, $\mathrm{Sp}_n(\mathbb{Q}_p)$
etc. The use of analytic methods, such as the representation theory of Lie
groups and Lie algebras, has expanded dramatically in recent decades. These
techniques are related to non-commutative generalizations of class field theory
(see §6.5). We first describe the group $G_{\mathbb{Q}}^{ab}$ starting from the Kronecker-Weber
theorem, which says that every Abelian extension $k$ of $\mathbb{Q}$ (i.e. an extension
whose Galois group $G(k/\mathbb{Q})$ is Abelian) is contained in a cyclotomic field
$K_m = \mathbb{Q}(\zeta_m)$, where $\zeta_m$ is a primitive root of unity of degree $m$ (see §4.1.2).
There is an isomorphism


$$\psi_m : (\mathbb{Z}/m\mathbb{Z})^\times \to G_m = G(K_m/\mathbb{Q}), \qquad (4.4.1)$$

which associates to a residue class $a \pmod m \in (\mathbb{Z}/m\mathbb{Z})^\times$, $(a, m) = 1$, an
automorphism $\sigma = \sigma_a = \psi_m(a) \in G_m$ given by the condition $\zeta_m^\sigma = \zeta_m^a$.


The arithmetical isomorphism (4.4.1) makes it possible to regard Dirichlet
characters $\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$ as one-dimensional representations


$$\rho_\chi : G \xrightarrow{\ \pi_m\ } G_m \xrightarrow{\ \psi_m^{-1}\ } (\mathbb{Z}/m\mathbb{Z})^\times \xrightarrow{\ \chi\ } \mathbb{C}^\times, \qquad (4.4.2)$$


where $\pi_m : G \to G_m$ is the natural homomorphism restricting the action of the
Galois automorphisms to the subfield $K_m$; $\rho_\chi = \chi \circ \psi_m^{-1} \circ \pi_m$. Hence each
character $\rho : G \to \mathbb{C}^\times$ has the form $\rho = \rho_\chi$ for some $\chi$. For example, a quadratic
extension $k = \mathbb{Q}(\sqrt{d})$ is contained in the cyclotomic extension $\mathbb{Q}(\zeta_{|D|})$, where
$D$ is the discriminant of $k$. This is easily shown using Gauss sums: for the
quadratic character $\chi = \chi_k$ of $k$ we have $G(\chi)^2 = D$. Hence $G(\chi) = \pm\sqrt{D} \in
K_{|D|}$. Recall that $\chi$ is a primitive quadratic character modulo $|D|$ which is
uniquely determined by the condition $\chi(-1) = \mathrm{sign}\, D$. The field $k$ corresponds
by Galois theory to the subgroup $\mathrm{Ker}\,\rho \subset G(K_{|D|}/\mathbb{Q})$ of index 2, $\rho = \rho_\chi$.
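The identity $G(\chi)^2 = D$ is easy to test numerically in the simplest case. The sketch below makes several assumptions not stated in the text: the conductor is an odd prime $p$, so that $\chi$ is the Legendre symbol modulo $p$ and $D = (-1)^{(p-1)/2}p$, and the Gauss sum is taken in the form $G(\chi) = \sum_{a=1}^{p-1} \chi(a)e^{2\pi i a/p}$. The function names are illustrative.

```python
import cmath


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p and a not divisible by p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(p):
    """G(chi) = sum over a of chi(a) * exp(2*pi*i*a/p), chi the Legendre symbol."""
    return sum(legendre(a, p) * cmath.exp(2j * cmath.pi * a / p)
               for a in range(1, p))


for p in (3, 5, 7, 13):
    D = p if p % 4 == 1 else -p  # discriminant of the quadratic subfield of Q(zeta_p)
    print(p, D, gauss_sum(p) ** 2)
```

The sign of $D$ agrees with $\chi(-1) = (-1)^{(p-1)/2}$, as required by the condition $\chi(-1) = \mathrm{sign}\, D$ above.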

By the Kronecker-Weber theorem, the maximal Abelian extension $\mathbb{Q}^{ab}$
can be described as the union of all $K_m$, and its Galois group coincides with
the projective limit of the groups $G_m \cong (\mathbb{Z}/m\mathbb{Z})^\times$, that is


$$G_{\mathbb{Q}}^{ab} \cong \varprojlim_m (\mathbb{Z}/m\mathbb{Z})^\times,$$


where the limit is taken over the system of natural projection homomorphisms

$$(\mathbb{Z}/m_1\mathbb{Z})^\times \to (\mathbb{Z}/m_2\mathbb{Z})^\times$$


for $m_2$ dividing $m_1$. Hence the group $G_{\mathbb{Q}}^{ab}$ coincides with the group $\prod_p \mathbb{Z}_p^\times$ of
invertible elements of the ring $\hat{\mathbb{Z}} = \prod_p \mathbb{Z}_p$ (the profinite completion of the ring
of integers).

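A more invariant formulation of this isomorphism is based on the introduction
of the ring of adeles $\mathbb{A}$ and its multiplicative group $J = \mathbb{A}^\times$, the ideles
of $\mathbb{Q}$ (see §4.3.7 and §4.3.8). The group $J$ consists of all infinite vectors


$$\alpha = (\alpha_\infty; \alpha_2, \alpha_3, \alpha_5, \cdots, \alpha_p, \cdots) \in \mathbb{R}^\times \times \prod_p \mathbb{Q}_p^\times,$$


such that $\alpha_p \in \mathbb{Z}_p^\times$ for all but a finite number of $p$. The quotient $\mathbb{A}^\times/U$ is
discrete. According to (4.3.55) we have


$$J/\mathbb{Q}^\times \cong \mathbb{R}_+^\times \times \prod_p \mathbb{Z}_p^\times,$$


where $\mathbb{R}_+^\times$ is the multiplicative group of all positive real numbers.
The group $G_{\mathbb{Q}}^{ab}$ is therefore isomorphic to the quotient of $J/\mathbb{Q}^\times$ by the
connected component of 1:


$$G_{\mathbb{Q}}^{ab} \cong \prod_p \mathbb{Z}_p^\times \cong J/\mathbb{R}_+^\times\mathbb{Q}^\times. \qquad (4.4.3)$$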
Pp 


The important feature of this isomorphism is that the elements of $G_m$, and
hence of $G_{\mathbb{Q}}^{ab}$, have an arithmetical nature; they correspond to prime numbers.
Namely, a prime $p$ not dividing $m$ corresponds to its Frobenius element
$\sigma = \sigma_p : \zeta_m \mapsto \zeta_m^p$. The set of all primes corresponding to a fixed element
$\sigma \in G_m$ is infinite by Dirichlet's theorem on primes in arithmetical progressions.
This set coincides with the set of primes of type $p = a + km$ ($k \in \mathbb{Z}$), where
$\sigma = \psi_m(a)$.




The automorphism $\sigma_p$ is called the Frobenius automorphism (and denoted $\mathrm{Fr}_p$
or $\mathrm{Frob}_p$) for the following reason: if we consider the ring $\mathcal{O}_m = \mathbb{Z}[\zeta_m]$ of all
integers in $K_m$, then in the reduction $\mathcal{O}_m/p\mathcal{O}_m$ we have $\mathrm{Fr}_p(x) = x^p$, i.e. $\mathrm{Fr}_p$
acts as the Frobenius automorphism. The way that $p$ splits into prime ideals in
$\mathcal{O}_m$ depends only on the image of $p$ in the Galois group $G_m \cong (\mathbb{Z}/m\mathbb{Z})^\times$ (see
§4.1.2). The idea of associating a Galois automorphism to a prime number (or
prime ideal) leads to the isomorphism (4.4.3), in which to $\mathrm{Fr}_p$ one associates
the class of the idele


$$\pi_p = (1; 1, \cdots, 1, p, 1, \cdots) \text{ in } J/(\mathbb{R}_+^\times \mathbb{Q}^\times).$$


The field $K_m$ corresponds to the open subgroup


$$U_m = \mathbb{R}_+^\times \times \prod_{p\mid m} (1 + m\mathbb{Z}_p)^\times \times \prod_{p\nmid m} \mathbb{Z}_p^\times \subset J,$$


so that $G_m \cong J/U_m\mathbb{Q}^\times$, [La73/87]. This formulation of the result is very easy
to extend to the general case of the group $G_k^{ab}$ for arbitrary global fields $k$.
Note that the set of all primitive Dirichlet characters can be identified with
the discrete group of all characters of finite order of the idele class group $C_k$
($k = \mathbb{Q}$) using the projection


$$J/\mathbb{Q}^\times \cong \mathbb{R}_+^\times \times \prod_p \mathbb{Z}_p^\times \to \prod_p \mathbb{Z}_p^\times.$$


Such characters are all trivial on the connected component of the identity.
Abelian extensions of $\mathbb{Q}$ correspond bijectively to open subgroups of $J/\mathbb{R}_+^\times\mathbb{Q}^\times$,
and any such group is the intersection of the kernels of a finite number of
Dirichlet characters.
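The remark that the splitting of $p$ in $\mathcal{O}_m$ depends only on the image of $p$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ can be made explicit. The sketch below relies on the standard fact, assumed here rather than quoted from the text, that for $p \nmid m$ the residue degree $f$ of every prime above $p$ equals the order of $\mathrm{Fr}_p$, i.e. the multiplicative order of $p$ modulo $m$, and that the number of primes above $p$ is $\varphi(m)/f$. The function names are illustrative.

```python
from math import gcd


def multiplicative_order(p, m):
    """Order of p in (Z/mZ)^x, i.e. the order of the Frobenius element Fr_p."""
    assert m >= 2 and gcd(p, m) == 1
    k, x = 1, p % m
    while x != 1:
        x = (x * p) % m
        k += 1
    return k


def euler_phi(m):
    """Euler's function: the order of (Z/mZ)^x, equal to [K_m : Q]."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def splitting_type(p, m):
    """(number of primes of O_m above p, common residue degree f) for p not dividing m."""
    f = multiplicative_order(p, m)
    return euler_phi(m) // f, f


for p, m in ((5, 12), (13, 12), (2, 7)):
    print(p, m, splitting_type(p, m))
```

In particular $p$ splits completely in $K_m$ exactly when $p \equiv 1 \pmod m$, that is, when $\mathrm{Fr}_p$ is trivial.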


4.4.2 Frobenius Automorphisms of Number Fields and Artin’s 
Reciprocity Map 


Let $K$ be an algebraic number field, $[K : \mathbb{Q}] = n$, and $\Sigma_K^0$ the set of all finite
places of $K$ (normalized discrete valuations which correspond to prime ideals
$\mathfrak{p}_v \ne 0$ in the ring of integers $\mathcal{O}_K$ of $K$);


$$\mathfrak{p}_v = \{x \in \mathcal{O}_K \mid |x|_v < 1\}.$$


The residue field $k(v) = \mathcal{O}_K/\mathfrak{p}_v$ is finite, having $Nv = p_v^{\deg v}$ elements, where
$p_v = \mathrm{Char}\, k(v)$ is the characteristic and $\deg v = f_v$ is the degree of the
extension (or inertial degree) of $k(v)$ over $\mathbb{F}_{p_v}$. The absolute value is normalized
by the condition


$$v(x) = -\log_{Nv} |x|_v \quad (|x|_v = Nv^{-v(x)}). \qquad (4.4.4)$$


The ramification index $e_v$ of $v$ is the number $v(p_v)$. With this notation one
has the following decomposition: $p\mathcal{O}_K = \prod_{v,\ v(p)>0} \mathfrak{p}_v^{e_v}$.
Let $L/K$ be a finite Galois extension with Galois group $G(L/K)$, and let
$w$ be a place of $L$ which extends a fixed place $v$ of $K$. Define the action of
the group $G(L/K)$ on the places $w \in \Sigma_L$ by $w \mapsto \sigma w$,


$$|x^\sigma|_{\sigma w} = |x|_w.$$


If $v$ and $w$ are non-Archimedean, $\mathfrak{p}_v$ and $\mathfrak{P}_w$ being the corresponding prime
ideals, then $\sigma w$ corresponds to the ideal $\mathfrak{P}_{\sigma w} = \mathfrak{P}_w^\sigma$. A Galois automorphism
$\sigma \in G(L/K)$ induces an isomorphism of the completions $\sigma : L_w \to L_{\sigma w}$ as
normed vector spaces over $K_v$.

The decomposition group $G_w$ is introduced as the subgroup


$$G_w = \{\sigma \in G(L/K) \mid \sigma w = w\} \subset G(L/K). \qquad (4.4.5)$$

By definition we have that

$$G_{\tau w} = \{\sigma \in G(L/K) \mid \sigma\tau w = \tau w\} = \tau G_w \tau^{-1}.$$


On the other hand, it is immediate from the explicit construction of the
extensions of places that $G(L/K)$ acts transitively on the set of places of $L$
lying over a fixed place $v$ of $K$. Hence all the corresponding subgroups $G_w$ are
conjugate [Wei74a].

The inertia group $I_w \subset G_w$ is by definition the kernel of the natural
homomorphism $G_w \cong G(L_w/K_v) \to G(l(w)/k(v))$, where $l(w)$ denotes the
residue field of the place $w$. The quotient group $G_w/I_w \cong G(l(w)/k(v))$ is
generated by the Frobenius automorphism: $G(l(w)/k(v)) = \langle \mathrm{Fr}_w \rangle$, $\mathrm{Fr}_w(x) =
x^{Nv}$. The place $w$ is called unramified iff $I_w = \{1\}$; in this case one has
$G_w = \langle \mathrm{Fr}_w \rangle$. It follows from the definitions that $\mathrm{Fr}_{\tau w} = \tau\,\mathrm{Fr}_w\,\tau^{-1}$, so that the
conjugacy class of $\mathrm{Fr}_w$ in $G(L/K)$, if defined, can depend only on $v$. It turns
out that all but a finite number of places are unramified; for such places we
put


$$F_{L/K}(v) = (\text{the conjugacy class of } \mathrm{Fr}_w \text{ for } w \mid v). \qquad (4.4.6)$$


If $G(L/K)$ is commutative, then the right hand side of (4.4.6) consists of one
element.

The Artin reciprocity law tells us where the Frobenius elements $F_{L/K}(v)$ are
situated in a commutative Galois group $G(L/K)$. Let $S$ be a finite set of
places of $K$, including all Archimedean places and those places ramified in
the extension $L/K$. Denote by $I^S$ the free Abelian (multiplicative) group
generated by the elements $\mathfrak{p}_v$ for $v \notin S$. Then the association $v \mapsto F_{L/K}(v) \in
G(L/K)$ extends to a homomorphism


$$F_{L/K} : I^S \to G(L/K), \qquad (4.4.7)$$


which is called Artin's reciprocity map,




$$F_{L/K}\Bigl(\prod_{v\notin S} \mathfrak{p}_v^{n_v}\Bigr) = \prod_{v\notin S} F_{L/K}(v)^{n_v}. \qquad (4.4.8)$$


Class field theory gives an explicit description of the kernel of the Artin
reciprocity map (4.4.7) (see §4.4.5 below). The statement that (4.4.7)
is surjective was established first, and it could be deduced from the general
Chebotarev density theorem, which is a far-reaching generalization of
Dirichlet's theorem on primes in arithmetical progressions (cf. [Chebo25], [Se70],
[Se68a], [Chev51]).

Let $P$ be a subset of the set $\Sigma_K^0$ of all non-Archimedean places of $K$. For
any integer $x \ge 1$ denote by $a_x(P)$ the number of places $v \in P$ such that
$Nv \le x$. We say that $P$ has density $a \ge 0$ if the following limit exists:

$$\lim_{x\to\infty} \frac{a_x(P)}{a_x(\Sigma_K^0)} = a. \qquad (4.4.9)$$


Not every set of places has a density. For example, if $K = \mathbb{Q}$ and $P$ is the set
of primes whose first digit is equal to 1, then $P$ does not have a density.

By the prime number theorem one has $a_x(\Sigma_K^0) \sim x/\log x$, hence the
condition (4.4.9) is equivalent to the following asymptotic expression


$$a_x(P) = a\,\frac{x}{\log x} + o\Bigl(\frac{x}{\log x}\Bigr). \qquad (4.4.10)$$


4.4.3 The Chebotarev Density Theorem 


Theorem 4.22. Let $L/K$ be a finite Galois extension of a number field $K$, and $X$
a subset of $G(L/K)$ invariant under conjugation. Denote by $P_X$ the set of
places $v \in \Sigma_K^0$ unramified in $L$ such that the classes of Frobenius elements
of these places belong to $X$: $F_{L/K}(P_X) \subset X$. Then the set $P_X$ has a density,
which is equal to $\mathrm{Card}\, X/\mathrm{Card}\, G(L/K)$.


The proof is based on analytic methods; the notion of the analytic density
of $P$ is introduced as the limit


$$\lim_{s\to 1+} \frac{\sum_{v\in P} Nv^{-s}}{\log\bigl(\frac{1}{s-1}\bigr)}. \qquad (4.4.11)$$

Proving the existence of and calculating this limit for $P = P_X$ can be done
with the help of the Artin L-functions (see §6.2.2); the density statement in
the above sense (4.4.9) can then be deduced (cf. [Chev40], [La70]).
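In the cyclotomic case the density statement can be observed numerically. The sketch below assumes $K = \mathbb{Q}$ and $L = K_m$, so that the Frobenius element of a prime $p \nmid m$ is the class of $p$ modulo $m$ and Theorem 4.22 predicts density $1/\varphi(m)$ for every class. The bound $x = 10^5$, the modulus $m = 8$ and the function names are arbitrary choices made for illustration.

```python
from collections import Counter
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def frobenius_densities(m, x):
    """Ratio a_x(P_a) / a_x(all primes) for each class a in (Z/mZ)^x."""
    primes = primes_up_to(x)
    counts = Counter(p % m for p in primes if gcd(p, m) == 1)
    return {a: counts[a] / len(primes) for a in sorted(counts)}


for a, density in frobenius_densities(8, 100_000).items():
    print(a, round(density, 4))  # each value is expected to be close to 1/4
```

The finitely many primes dividing $m$ are counted in the denominator but belong to no class; this does not affect the limit in (4.4.9).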


4.4.4 The Decomposition Law and 
the Artin Reciprocity Map 


If $L/K$ is an Abelian extension, then the decomposition of $\mathfrak{p}_v$ in $\mathcal{O}_L$ is
completely determined by the order $f$ of the element $\sigma = F_{L/K}(v) \in G(L/K)$: in this
case $\mathfrak{p}_v = \mathfrak{P}_{w_1} \cdots \mathfrak{P}_{w_s}$, where $s = (G(L/K) : \langle\sigma\rangle)$ and $f = f(w_i/v) =
\deg w_i/\deg v = [l(w_i) : k(v)]$ is the relative residue field degree. This fact is
deduced from the transitivity of the action of the Galois group $G(L/K)$ on
the set of places $w$ dividing $v$. In particular, the place $v$ splits completely (i.e.
$f = 1$ and $v$ is unramified) iff $F_{L/K}(v) = 1 \in G(L/K)$.

Theorem 4.22 shows that a finite Galois extension $L/K$ is uniquely
determined (in a fixed algebraic closure $\bar{K}$) by the set $\mathrm{Spl}_{L/K}$ of places which
split completely in $L/K$. The Artin reciprocity law gives us, amongst other
things, a description of this set when $L/K$ is Abelian. For non-Abelian
extensions there are only some special cases when $\mathrm{Spl}_{L/K}$ is known. However these
examples provide a basis for quite general conjectures (the Langlands program,
see §6.5). These conjectures nowadays determine one of the main directions in
modern algebraic number theory.


4.4.5 The Kernel of the Reciprocity Map 


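In order to formulate the main result on the kernel of the reciprocity map
(4.4.7) we recall that the relative norm $N_{L/K}(w)$ of a non-Archimedean place
$w$ is defined as $\mathfrak{p}_v^{f(w/v)}$ or, in additive terms, as $f(w/v) \cdot v$, where


$$f(w/v) = \deg w/\deg v = [l(w) : k(v)] = \log_{Nv} Nw$$


is the relative degree of residue fields. Also, consider the divisor map (see
(4.3.49))


$$\mathrm{div}_S : K^\times \to I^S, \quad \mathrm{div}_S(a) = \prod_{v\notin S} \mathfrak{p}_v^{v(a)},$$

where $S$ is the set of all Archimedean places and places ramified in $L/K$.


Let $L/K$ be an Abelian extension of $K$, and $\mathfrak{f} = \prod_v \mathfrak{p}_v^{r(v)}$ an ideal in $\mathcal{O}_K$
divisible by sufficiently high powers $r(v)$ of the prime ideals ramified in $L$. For
each Archimedean place $v \in \Sigma_K^\infty$ we fix an embedding


$$\alpha \mapsto \alpha^v \in \mathbb{C}, \quad K \to K_v \subset \mathbb{C},$$

which induces $v$, and let

$$\Sigma_{L/K}^\infty = \{v \in \Sigma_K^\infty \mid K_v \cong \mathbb{R},\ L_w \cong \mathbb{C} \text{ for } w \mid v\}.$$

Define the subgroups $P_{L/K}(\mathfrak{f})$, $N_{L/K}(\mathfrak{f}) \subset I^S$ by


$$P_{L/K}(\mathfrak{f}) = \{\mathrm{div}_S(a) \mid a \in K^\times,\ a \equiv 1 \bmod \mathfrak{f},\ \forall v \in \Sigma_{L/K}^\infty\ a^v > 0\}, \qquad (4.4.12)$$


$$N_{L/K}(\mathfrak{f}) = \langle N_{L/K}(w) \rangle_{w \mid v,\ v \notin S}, \qquad (4.4.13)$$


the latter being the subgroup generated by the relative norms of prime divisors
of those places $v$ (or ideals $\mathfrak{p}_v$) which are unramified in $L/K$.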
Theorem 4.23 (The Artin Reciprocity Law). Let $L/K$ be an Abelian
extension. Then


$$\mathrm{Ker}\, F_{L/K} = P_{L/K}(\mathfrak{f}) \cdot N_{L/K}(\mathfrak{f}). \qquad (4.4.14)$$


Corollary 4.24 (Description of the Galois group). For an Abelian
extension $L/K$ the reciprocity map (4.4.7) induces an isomorphism


$$G(L/K) \cong I^S/(P_{L/K}(\mathfrak{f})\, N_{L/K}(\mathfrak{f})).$$


4.4.6 The Artin Symbol 
Consider the group $J_K$ of ideles, and define a surjective homomorphism

$$(\cdot, L/K) : J_K \to G(L/K), \quad s \mapsto (s, L/K) \qquad (4.4.15)$$


with the help of the reciprocity map (4.4.7). For an arbitrary $s \in J_K$ let us
choose a principal idele $\alpha \in K^\times$ such that $|\alpha s_v - 1|_v < \varepsilon$ for $v \in S$ and
sufficiently small $\varepsilon > 0$. Define the $S$-divisor (cf. (4.3.49)) by


$$\mathrm{div}_S(\alpha s) = \prod_{v\notin S} \mathfrak{p}_v^{v(\alpha s_v)}.$$

Then the Artin symbol $(s, L/K) = \psi_{L/K}(s)$ is defined by the formula


$$(s, L/K) = \psi_{L/K}(s) = F_{L/K}(\mathrm{div}_S(\alpha s)). \qquad (4.4.16)$$


We stress that (4.4.16) is defined in terms of ideles, and in order to show
that (4.4.16) is well defined it is essential that the reciprocity law in terms of
ideals (4.4.14) is satisfied. Indeed, the condition on $\alpha$ in (4.4.16) is satisfied
if $\mathrm{div}(\alpha) \in P_{L/K}(\mathfrak{f})$ with an appropriate choice of $\mathfrak{f}$. Now the reciprocity
law transforms into the statement that $\mathrm{Ker}\,\psi_{L/K}$ coincides with $K^\times N_{L/K}J_L$,
where $N_{L/K}J_L$ is the subgroup of relative norms of ideles from $J_L$:


$$N_{L/K}((\beta_w)_w) = \Bigl(\prod_{w\mid v} N_{L_w/K_v}(\beta_w)\Bigr)_v. \qquad (4.4.17)$$


Hence the Artin symbol $\psi_{L/K}$ in (4.4.16) is defined for idele classes $s \in C_K =
J_K/K^\times$. Furthermore $F_{L/K}(v) = \psi_{L/K}(s(v))$, where $s(v)$ is the idele class of
$(\cdots, 1, \pi_v, 1, \cdots)$ for a local uniformizer $\pi_v \in K_v^\times$, i.e. an element with the
condition $v(\pi_v) = 1$. The homomorphism $\psi_{L/K} : C_K \to G(L/K)$ is continuous
and its kernel is both open and closed, again in view of (4.4.14).
4.4.7 Global Properties of the Artin Symbol


Let $H$ be a subgroup of a finite group $G$. Then the transfer homomorphism
(or Verlagerung)


$$\mathrm{Ver} : G/[G, G] \to H/[H, H], \qquad (4.4.18)$$


is defined by $\mathrm{Ver}(g[G, G]) = \prod_{r\in R} h(g, r)$, where $r$ runs through a system
of representatives $R$ of left cosets $G/H$ and $h(g, r) \in H$ is defined by the
condition $gr = \overline{gr}\, h(g, r)$ ($\overline{gr} \in R$ being the representative of $gr$ in $R$).
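The definition of the transfer can be tried out on small permutation groups. The sketch below is illustrative only: permutations are written as tuples, the subgroup $H$ is assumed to be Abelian (so that $H/[H, H] = H$ and the order of the factors does not matter), and the function names are hypothetical. Two cases are shown: a cyclic group of order 6 with its subgroup of index 2, where the transfer is $g \mapsto g^2$, and $S_3$ with $H = A_3$, where the transfer is trivial.

```python
def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def transfer(g, H, R):
    """Product over r in R of h(g, r), where g*r = rep(g*r) * h(g, r).

    H is a set of permutations forming an Abelian subgroup, and R is a list of
    representatives of the left cosets G/H.
    """
    result = tuple(range(len(g)))
    for r in R:
        gr = compose(g, r)
        # The representative of g*r is the unique t in R with t^{-1} * g*r in H.
        rep = next(t for t in R if compose(inverse(t), gr) in H)
        result = compose(result, compose(inverse(rep), gr))
    return result


# Cyclic group of order 6 generated by c, with H = <c^2> of index 2.
e6 = tuple(range(6))
c = (1, 2, 3, 4, 5, 0)
c2 = compose(c, c)
c4 = compose(c2, c2)
print(transfer(c, {e6, c2, c4}, [e6, c]) == c2)  # transfer is g -> g^2

# S_3 with H = A_3 and coset representatives {identity, a transposition}.
e3, cyc, tau = (0, 1, 2), (1, 2, 0), (1, 0, 2)
A3 = {e3, cyc, compose(cyc, cyc)}
print(transfer(tau, A3, [e3, tau]), transfer(cyc, A3, [e3, tau]))
```

In the second case the result is the identity for every $g$, as it must be, since $S_3/[S_3, S_3]$ has order 2 and $A_3$ has order 3.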


1) There is a one-to-one correspondence between open subgroups $U \subset C_K$
and finite Abelian extensions $L/K$, such that the symbol (4.4.16) induces
an isomorphism

$$C_K/U \cong G(L/K),$$

and $U$ coincides with the norm subgroup $U = N_{L/K}(C_L)$ (see (4.4.17)).

2) Let $K'/K$ be an arbitrary finite extension. Then for $a \in C_{K'}$ the following
equation holds


$$(N_{K'/K}(a), L/K) = \overline{(a, LK'/K')}. \qquad (4.4.19)$$


3) Let $L'/K$ be a finite Galois extension, $L/K$ the maximal Abelian
subextension of $L'/K$, and $K'$ a subextension of $L'/K$ over which $L'$ is Abelian.
Then


$$(a, L'/K') = \mathrm{Ver}(a, L/K), \qquad (4.4.20)$$


where Ver is the transfer (4.4.18).
4) Let $L'/K$ be a finite Galois subextension of $L/K$. Then for all $a \in C_K$
the following equation holds


$$(a, L'/K) = \overline{(a, L/K)}. \qquad (4.4.21)$$


5) Let $\sigma$ be an isomorphism of $K$ onto $\sigma K$, $\sigma \in \mathrm{Aut}\, \bar{K}$. Then for all $a \in C_K$
the equation holds


$$(\sigma a, \sigma L/\sigma K) = \sigma (a, L/K) \sigma^{-1}. \qquad (4.4.22)$$


The bar in the above formulae denotes the restriction to a subfield (cf.
[Chev40], [Koch70], [AT51], [Wei74a]).

These properties make it possible to extend the definition of the Artin
symbol to infinite Abelian extensions $L/K$. Consider the correspondence


$$s \mapsto (s, L/K) = \varprojlim_\nu (s, L_\nu/K), \qquad (4.4.23)$$


where $L_\nu/K$ runs through all finite subextensions of $L/K$. It follows from 4)
that this is well defined and one has a map from $C_K$ to $G(L/K)$ with dense
image. Taking into account the one-to-one correspondence between subgroups
of finite index of $C_K$ and $G(K^{ab}/K)$, we see that $G(K^{ab}/K)$ is isomorphic
to the profinite completion of $C_K$, which in fact coincides with the quotient of $C_K$
by its connected component. The reciprocity homomorphism thus constructed
satisfies the properties 2), 4) and 5).


Note that a new approach to class field theory was developed by
J.Neukirch in Chapters 4-6 of [Neuk99].

From a profinite group $G$ endowed with a surjective continuous homomorphism
$d : G \to \hat{\mathbb{Z}}$ and a $G$-module $A$ endowed with a "Henselian valuation
with respect to $d$", one constructs in an elementary way the reciprocity
homomorphisms; if $A$ satisfies the so-called class field axiom (a statement involving
zeroth and $(-1)$st cohomology of $A$), then the reciprocity homomorphisms are
isomorphisms. From this abstract class field theory, both local and global class
field theory are deduced, and the classical formulation of global class field
theory, using ray class groups instead of the idele class group, is presented as well.
Neukirch's approach minimizes the cohomological tools needed to construct
class field theory.


4.4.8 A Link Between the Artin Symbol and Local Symbols 


Suppose that we already know the existence of the Artin symbol on ideles
(4.4.16). For a finite Abelian extension $L/K$, a non-Archimedean place $v$ of
$K$ and an extension $w$ of $v$ to $L$, consider the completions $K_v$ and $L_w$, and
the decomposition group


$$G_v \subset G(L/K), \quad G_v \cong G(L_w/K_v),$$


which in the Abelian case does not depend on the choice of $w$. Consider
the embedding $i_v : K_v^\times \to J_K$ and the projection onto the $v$-component
$j_v : J_K \to K_v^\times$, where $i_v$ maps $x \in K_v^\times$ onto the element of $J_K$ whose
$v$-component is equal to $x$, and whose other components are all 1. Put


$$\psi_v = \psi_{L/K} \circ i_v = (\cdot, L_w/K_v)_v. \qquad (4.4.24)$$


Then one verifies that the image of $\psi_v$ belongs to the decomposition group $G_v$.
The homomorphism $\psi_v : K_v^\times \to G_v$ is called the local Artin homomorphism
(or the norm residue homomorphism). If $x = (x_v) \in J_K$, then the following
decomposition holds


$$\psi_{L/K}(x) = \prod_v \psi_v(x_v), \qquad (4.4.25)$$


where


$$x = \lim_S \Bigl(\prod_{v\in S} i_v(x_v)\Bigr)$$

(the limit is taken over an increasing family of finite sets $S$ of places of $K$). The
product (4.4.25) is actually finite: if a component $x_v$ is a $v$-unit and $v$ is unramified,
then $x_v$ is a norm in the extension $L_w/K_v$: for some $y_w \in L_w$ one has $x_v =
N_{L_w/K_v} y_w$. The existence of $y_w$ is established by Hensel's lemma (see 4.3.2).

Thus the knowledge of all local Artin maps $\psi_v$ is equivalent to the
knowledge of the global Artin map $\psi_{L/K}$. In classical work on class
field theory the local reciprocity maps were studied using the global theory;
in particular, it was shown that these local maps depend only on the local
extensions $L_w/K_v$, and are independent of the global extension $L/K$ from
which they are obtained. In this sense, modern expositions of class field
theory (for example, in [Chev40], [Wei74b]) differ from classical texts: first
one gives a purely local and independent construction of maps

$$\theta_v : K_v^\times \to G_v = G(L^v/K_v), \qquad (4.4.26)$$

where $L^v$ is a finite Abelian extension of $K_v$. Then one proves that the
product $\prod_v \theta_v$ has the properties which uniquely characterize the
homomorphism $\psi_{L/K}$. The most important part of the proof consists of
verifying the product formula

$$\prod_v \theta_v(a) = 1 \quad \text{for all } a \in K^\times. \qquad (4.4.27)$$


In the case of a quadratic extension $L = K(\sqrt{b})$ the image $\theta_v(a)$
belongs to $\{\pm 1\} \cong G(L/K)$, and coincides with the Hilbert symbol,
defined in §4.3.3. The product formula is equivalent to the quadratic
reciprocity law of Gauss, which thus becomes a special case of the general
reciprocity law (4.4.14).
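As an illustration for $K = \mathbb{Q}$, the product formula for Hilbert symbols can be checked numerically. The sketch below is not taken from the text: it assumes the standard explicit formulas for the Hilbert symbol $(a,b)_p$ over $\mathbb{Q}_p$ (written in terms of Legendre symbols for odd $p$, and of the classes of the unit parts modulo 8 for $p = 2$), and all function names are illustrative.

```python
def legendre(u: int, p: int) -> int:
    """Legendre symbol (u/p) for an odd prime p not dividing u (Euler's criterion)."""
    return 1 if pow(u % p, (p - 1) // 2, p) == 1 else -1


def split_power(a: int, p: int):
    """Write a = p**alpha * u with p not dividing u; return (alpha, u)."""
    alpha = 0
    while a % p == 0:
        a //= p
        alpha += 1
    return alpha, a


def hilbert(a: int, b: int, p) -> int:
    """Hilbert symbol (a, b)_p over Q for nonzero integers a, b.

    p is a prime number, or the string "inf" for the real place.
    """
    if p == "inf":
        return -1 if (a < 0 and b < 0) else 1
    alpha, u = split_power(a, p)
    beta, v = split_power(b, p)
    if p == 2:
        eps = lambda x: ((x - 1) // 2) % 2        # class of (x-1)/2 mod 2
        omega = lambda x: ((x * x - 1) // 8) % 2  # class of (x^2-1)/8 mod 2
        e = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if e % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha


def prime_divisors(n: int):
    """Prime divisors of |n| by trial division."""
    n, out, d = abs(n), [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out


def product_of_symbols(a: int, b: int) -> int:
    """Product of (a, b)_v over all places v where the symbol can differ from 1."""
    places = {"inf", 2, *prime_divisors(a), *prime_divisors(b)}
    result = 1
    for p in places:
        result *= hilbert(a, b, p)
    return result
```

For two distinct odd primes $p, q$ the only possibly nontrivial factors are $(p,q)_p = (q/p)$, $(p,q)_q = (p/q)$ and $(p,q)_2 = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}$, so `product_of_symbols(p, q) == 1` is exactly the quadratic reciprocity law.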

The construction of the map (4.4.26) for an arbitrary Abelian extension
$L^v/K_v$ is usually carried out using methods of Galois cohomology theory (see
§4.5, and [Se63], [Se64], [Chev40], [Koch70], [Koch97]). A more direct
construction of $\theta_v$ was suggested in [Haz78], [Iw86], based on an
explicit analysis of cohomological constructions in low dimensions (compare
with the approach of J. Neukirch, cf. [Neuk99]).


4.4.9 Properties of the Local Symbol

The properties of the local symbol

$$\theta_v = \psi_v = (\cdot, L_w/K_v) : K_v^\times \to G_v$$

are completely analogous to the corresponding properties 1) to 5) from §4.4.7,
replacing $C_K = J_K/K^\times$ by $K_v^\times$, $G(L/K)$ by $G_v$ and
$\psi_{L/K}$ by $\theta_v$. Also, the homomorphism $\theta_v$ maps the group of
units $U_v = O_v^\times$ of $K_v$ onto the inertia group $I^w \subset G_v$. If
$L_w/K_v$ is unramified, then for all $a \in K_v^\times$ one has

$$\theta_v(a) = \mathrm{Fr}_v^{v(a)},$$

where $\mathrm{Fr}_v \in G_v$ is the Frobenius element of the extension, and
the valuation $v$ of $K_v$ is normalized by the condition
$v(K_v^\times) = \mathbb{Z}$. In the same way as for $\psi_{L/K}$ the local
symbol can be generalized to infinite Abelian extensions, and thus one obtains
a reciprocity map $\theta_v : K_v^\times \to G(K_v^{ab}/K_v)$, where
$K_v^{ab}$ is the maximal Abelian extension of $K_v$. The Galois group can then
be described as follows:

$$G(K_v^{ab}/K_v) \cong (K_v^\times)^{\wedge} \cong \hat{\mathbb{Z}} \times O_v^\times, \qquad (4.4.28)$$

where $\wedge$ denotes the profinite completion. Under the isomorphism (4.4.28)
the Galois group $G(K_v^{nr}/K_v) \cong G(\bar{\mathbb{F}}_q/\mathbb{F}_q)$ of
the maximal unramified extension $K_v^{nr}$ of $K_v$ becomes
$\hat{\mathbb{Z}}$, and the whole group of units $O_v^\times$ maps
isomorphically onto the inertia group $I_v = I^w$:

$$\theta_v : O_v^\times \xrightarrow{\ \sim\ } I^w \qquad (4.4.29)$$

(the field $K_v^{nr}$ can be defined as the maximal extension of $K_v$ for
which the extension of the valuation $v$ satisfies the property
$v((K_v^{nr})^\times) = \mathbb{Z}$).
Below we give a remarkable explicit construction of the maximal Abelian
extension $K_v^{ab}$ of a local non-Archimedean field $K_v$, generalizing the
construction of $\mathbb{Q}_p^{ab}$ by adjoining roots of unity to
$\mathbb{Q}_p$.


4.4.10 An Explicit Construction of Abelian Extensions of a Local 
Field, and a Calculation of the Local Symbol 


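(cf. [LT65], [Se63], [Chev40], [Ha50], [Sha50], [CW77], [Koly79]). Consider
first the field $\mathbb{Q}_p$ as a model example. Any Abelian extension of
this field is contained in a cyclotomic extension, i.e.
$\mathbb{Q}_p^{ab} = \mathbb{Q}_p(W_\infty)$, where
$W_\infty = \bigcup_{n \ge 1} W_n$,
$W_n = \{\zeta \in \bar{\mathbb{Q}}_p \mid \zeta^n = 1\}$, so that $W_\infty$
is the set of all roots of unity in $\bar{\mathbb{Q}}_p$. Let
$W_{p^\infty} = \bigcup_{m \ge 0} W_{p^m}$ be the subset of all roots of unity
of $p$-power order, and $V_\infty = \bigcup_{p \nmid n} W_n$ the subset of
roots of unity of order not divisible by $p$. Then

$$W_\infty = V_\infty \times W_{p^\infty}, \qquad \mathbb{Q}_p(W_\infty) = \mathbb{Q}_p(V_\infty) \cdot \mathbb{Q}_p(W_{p^\infty}),$$

and the following decomposition takes place:

$$G(\mathbb{Q}_p^{ab}/\mathbb{Q}_p) \cong G(\mathbb{Q}_p(V_\infty)/\mathbb{Q}_p) \times G(\mathbb{Q}_p(W_{p^\infty})/\mathbb{Q}_p). \qquad (4.4.30)$$

Here $\mathbb{Q}_p(V_\infty)$ is the maximal unramified extension (see the
example from 4.3.4), for which $v_p(\mathbb{Q}_p(V_\infty)^\times) = \mathbb{Z}$
and

$$G(\mathbb{Q}_p(V_\infty)/\mathbb{Q}_p) \cong G(\bar{\mathbb{F}}_p/\mathbb{F}_p) \cong \hat{\mathbb{Z}} = \langle \mathrm{Fr}_p \rangle^{\wedge}. \qquad (4.4.31)$$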


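The field generated by $W_{p^\infty} = \bigcup_{m \ge 0} W_{p^m}$ is the union
of all the totally ramified extensions of $\mathbb{Q}_p$. The Galois group
$G(\mathbb{Q}_p(W_{p^\infty})/\mathbb{Q}_p)$ can be described by means of its
action on the set $W_{p^\infty}$ of all roots of unity of $p$-power order. In
order to do this we note the isomorphisms

$$\mathrm{End}\, W_{p^\infty} \cong \mathbb{Z}_p, \qquad \mathrm{Aut}\, W_{p^\infty} \cong \mathbb{Z}_p^\times,$$

in which a $p$-adic number
$\alpha = a_0 + a_1 p + a_2 p^2 + \cdots \in \mathbb{Z}_p$ in its digital form
(4.3.2) corresponds to the endomorphism $[\alpha] : \zeta \mapsto \zeta^\alpha$
for $\zeta \in W_{p^\infty}$:

$$\zeta^\alpha \stackrel{\mathrm{def}}{=} \zeta^{a_0 + a_1 p + a_2 p^2 + \cdots + a_{m-1} p^{m-1}} \quad \text{if } \zeta \in W_{p^m} \subset W_{p^\infty}.$$

From the action of the Galois group on $W_{p^\infty}$ one obtains a homomorphism

$$\delta_p : G(\mathbb{Q}_p(W_{p^\infty})/\mathbb{Q}_p) \to \mathrm{Aut}\, W_{p^\infty} \cong \mathbb{Z}_p^\times \qquad (4.4.32)$$

which is a one-dimensional $p$-adic Galois representation (the cyclotomic
representation), and (4.4.32) is an isomorphism.

It turns out that the local symbol
$\theta_p(a) = (a, \mathbb{Q}_p^{ab}/\mathbb{Q}_p) \in G(\mathbb{Q}_p^{ab}/\mathbb{Q}_p)$
for an element $a = p^m u$ ($m \in \mathbb{Z}$, $u \in \mathbb{Z}_p^\times$)
can be described using the isomorphisms (4.4.31), (4.4.32):

$$\theta_p(a) = \begin{cases} \mathrm{Fr}_p^{m} & \text{on the subfield } \mathbb{Q}_p(V_\infty), \\ [u^{-1}] & \text{on the subfield } \mathbb{Q}_p(W_{p^\infty}). \end{cases}$$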
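To make the recipe concrete, here is a minimal sketch computing how $\theta_p(a)$ acts at a finite level. It assumes, purely for computability, that $a$ is a nonzero rational number, and it encodes the action on the $p^n$-th roots of unity by the exponent $c$ in $\zeta \mapsto \zeta^c$, where $c \equiv u^{-1} \pmod{p^n}$. The function name is illustrative and not from the text.

```python
from fractions import Fraction


def local_symbol_on_p_power_roots(a, p: int, n: int):
    """For a = p**m * u (u a p-adic unit), return (m, c).

    m is the exponent of Frobenius on the unramified part Q_p(V_infinity);
    c is the exponent such that theta_p(a) sends every p**n-th root of unity
    zeta to zeta**c, i.e. c = u**(-1) mod p**n.
    Assumption: a is a nonzero rational number (a dense subset of Q_p^x).
    """
    a = Fraction(a)
    num, den, m = a.numerator, a.denominator, 0
    while num % p == 0:       # strip powers of p from the numerator
        num //= p
        m += 1
    while den % p == 0:       # ... and from the denominator
        den //= p
        m -= 1
    modulus = p ** n
    # u = num/den, hence u^{-1} = den/num modulo p**n
    c = den * pow(num % modulus, -1, modulus) % modulus
    return m, c
```

For example, with $p = 5$ and $a = 10 = 5 \cdot 2$ the symbol acts as $\mathrm{Fr}_5$ on the unramified part and as $\zeta \mapsto \zeta^{13}$ on the 25th roots of unity, since $2 \cdot 13 \equiv 1 \pmod{25}$.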


We now reformulate this in a manner more suitable for generalization.
Consider the sets $E_{p^\infty} = \bigcup_{m \ge 0} E_{p^m}$, where

$$E_{p^m} = \{w = \zeta - 1 \mid \zeta \in W_{p^m}\}.$$

These sets are groups with respect to the group law

$$w_1 \circ w_2 := w_1 + w_2 + w_1 w_2 \qquad (w_1, w_2 \in E_{p^\infty}),$$

and for all $w \in E_{p^\infty}$ one has $|w|_p < 1$. The set $E_p$ consists of
all roots of the polynomial

$$f_p(X) = (X+1)^p - 1 = pX + \binom{p}{2} X^2 + \cdots + X^p,$$

which becomes irreducible after division by $X$ according to Eisenstein's
irreducibility criterion. Its roots therefore generate a field
$\mathbb{Q}_p(E_p)$ of degree $p - 1$ over $\mathbb{Q}_p$.
Now consider the iterations of the polynomial $f_p(X)$:

$$f_{p^2}(X) = f_p(f_p(X)) = (X+1)^{p^2} - 1,$$

$$f_{p^m}(X) = f_{p^{m-1}}(f_p(X)).$$

The group $E_{p^m}$ coincides with the set of all roots of the polynomial
$f_{p^m}(X)$, and this is isomorphic to $p^{-m}\mathbb{Z}_p/\mathbb{Z}_p$.
Under this isomorphism the obvious inclusions $E_{p^m} \subset E_{p^{m+1}}$
become the natural embeddings
$p^{-m}\mathbb{Z}_p/\mathbb{Z}_p \subset p^{-m-1}\mathbb{Z}_p/\mathbb{Z}_p$,
and we see that $E_{p^\infty} \cong \mathbb{Q}_p/\mathbb{Z}_p$. From this it
follows that $\mathrm{End}\, E_{p^\infty} = \mathbb{Z}_p$. We have
$\mathbb{Q}_p(E_{p^\infty}) = \mathbb{Q}_p(W_{p^\infty})$, and the isomorphism
(4.4.32) takes the form

$$\delta_p : G(\mathbb{Q}_p(E_{p^\infty})/\mathbb{Q}_p) \xrightarrow{\ \sim\ } \mathrm{Aut}\, E_{p^\infty} \cong \mathbb{Z}_p^\times. \qquad (4.4.33)$$
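The two facts used above, namely that $f_p(X)/X$ satisfies Eisenstein's criterion and that iterating $f_p$ gives $(X+1)^{p^2} - 1$, can be checked for small $p$ with elementary polynomial arithmetic. The following is only a sketch: polynomials are represented as integer coefficient lists (lowest degree first), and the helper names are illustrative.

```python
from math import comb


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_compose(f, g):
    """Coefficients of f(g(X)), computed by Horner's rule."""
    result = [f[-1]]
    for c in reversed(f[:-1]):
        result = poly_mul(result, g)
        result[0] += c
    return result


def f_p(p: int):
    """Coefficient list of f_p(X) = (X + 1)**p - 1."""
    coeffs = [comb(p, k) for k in range(p + 1)]
    coeffs[0] -= 1
    return coeffs


def is_eisenstein(coeffs, p: int) -> bool:
    """Eisenstein's criterion at p: p divides all lower coefficients,
    p does not divide the leading one, p**2 does not divide the constant term."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)
```

Here `f_p(p)[1:]` is the coefficient list of $f_p(X)/X$, so `is_eisenstein(f_p(p)[1:], p)` expresses the irreducibility argument, and `poly_compose(f_p(p), f_p(p))` can be compared with the coefficients of $(X+1)^{p^2} - 1$.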


Now let $K_v$ be an arbitrary finite extension of $\mathbb{Q}_p$ with valuation
ring $O_v$, maximal ideal $\mathfrak{p}_v = (\pi)$ and
$q = |O_v/\mathfrak{p}_v|$. Here $\pi$ is a uniformizing element, i.e.
$v(\pi) = 1$. There is an analogous construction of the maximal Abelian
extension of $K_v$. Consider the polynomial

$$f_\pi(X) = \pi X + X^q. \qquad (4.4.34)$$

It follows as before by Eisenstein's criterion that $f_\pi(X)/X$ is
irreducible. Define recursively the iterations

$$f_{\pi^m}(X) = f_{\pi^{m-1}}(f_\pi(X)), \qquad m \ge 1.$$

Then the sets of roots

$$W_{f,m} = \{x \in \bar{K}_v \mid f_{\pi^m}(x) = 0\} \qquad (4.4.35)$$

of the polynomials $f_{\pi^m}(X)$ form an increasing sequence:

$$W_{f,m} \subset W_{f,m+1},$$

and there is a natural group structure on (4.4.35) such that $W_{f,m}$ is
isomorphic to $\mathfrak{p}_v^{-m}/O_v$ ($\cong O_v/\mathfrak{p}_v^{m}$). The
inclusions $W_{f,m} \subset W_{f,m+1}$ become the natural embeddings
$\mathfrak{p}_v^{-m}/O_v \subset \mathfrak{p}_v^{-m-1}/O_v$. Thus we obtain a
group, which is analogous to the group of all roots of unity of $p$-power
order:

$$W_{f,\infty} = \bigcup_{m \ge 1} W_{f,m} \quad \text{is isomorphic to } K_v/O_v. \qquad (4.4.36)$$


There is a natural action of elements $a \in O_v = \mathrm{End}(K_v/O_v)$ on
$W_{f,m}$ for which the equation $[\pi]_f(x) = f_\pi(x)$ holds. This action
will be denoted by $[a]_f : x \mapsto [a]_f x$. The action of the Galois group
on the roots of the polynomials $f_{\pi^m}(X)$ provides us with a
representation analogous to (4.4.32):

$$G(\bar{K}_v/K_v) \to \mathrm{Aut}\, W_{f,\infty} \cong O_v^\times. \qquad (4.4.37)$$

Denote by $K_\pi$ the field which corresponds to the kernel of the homomorphism
(4.4.37). Then $K_\pi$ is an Abelian extension of $K_v$ in view of the
isomorphism

$$\delta_v : G(K_\pi/K_v) \xrightarrow{\ \sim\ } O_v^\times, \qquad (4.4.38)$$

and we obtain the following explicit description of the Abelian extensions of
$K_v$:

$$K_v^{ab} = K_v^{nr} \cdot K_\pi,$$

where $K_v^{nr} = K_v(V_\infty)$ is the maximal unramified extension of $K_v$
($V_\infty$ is the group of all roots of unity of degree not divisible by $p$),

$$G(K_v^{nr}/K_v) \cong G(\bar{\mathbb{F}}_q/\mathbb{F}_q) \cong \hat{\mathbb{Z}} = \langle \mathrm{Fr}_q \rangle^{\wedge}. \qquad (4.4.39)$$

The field $K_\pi = \bigcup_m K_v(W_{f,m})$ is the union of all totally ramified
Abelian extensions of $K_v$.
The norm residue symbols can then be described as follows:

1) for $u \in O_v^\times$ the element $\theta_v(u) = (u, K_\pi/K_v)_v$ acts on
$W_{f,\infty}$ via $[u^{-1}]_f$;

2) the norm residue symbol $(\pi, K_\pi/K_v)_v$ is equal to 1;

3) the symbol $\theta_v(a)$ for $a = \pi^m u$ ($m \in \mathbb{Z}$,
$u \in O_v^\times$) acts on $K_v^{nr}$ as
$\mathrm{Fr}_q^{m} \in G(K_v^{nr}/K_v)$.


A remarkable feature of the construction of the group law on the set
$W_{f,\infty}$ is that, for a given uniformizer $\pi$, the field $K_\pi$ does
not depend on the choice of the polynomial $f(X) \in O_v[X]$ (and the
composite of $K_\pi$ with the maximal unramified extension does not depend on
$\pi$ either). The polynomial need only satisfy the following requirements:

$$f(X) \equiv \pi X \quad (\text{modulo terms of degree} \ge 2), \qquad (4.4.40)$$

$$f(X) \equiv X^q \pmod{\pi}. \qquad (4.4.41)$$

Moreover, instead of a polynomial $f(X)$ one may use any element of the set
$F_\pi$ of power series $f(X) \in O_v[[X]]$ satisfying the above conditions
(4.4.40), (4.4.41).

The above group law is constructed in the theory of Lubin-Tate formal
groups.
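In the model case $K_v = \mathbb{Q}_p$, $\pi = p$, $q = p$, both $pX + X^p$ and the cyclotomic polynomial $(X+1)^p - 1$ satisfy (4.4.40) and (4.4.41). The sketch below checks these two conditions for polynomials with integer coefficients; it is restricted to this special case by assumption, and the function name is illustrative.

```python
from math import comb


def satisfies_lubin_tate_conditions(coeffs, p: int) -> bool:
    """Check (4.4.40) and (4.4.41) for K_v = Q_p, pi = p, q = p.

    coeffs: integer coefficients of f(X), lowest degree first.
    (4.4.40): no constant term and linear coefficient equal to p.
    (4.4.41): modulo p, f(X) reduces to the single monomial X**p.
    """
    if len(coeffs) <= p:
        return False  # f must contain the monomial X**p
    linear_ok = coeffs[0] == 0 and coeffs[1] == p
    reduction_ok = all(c % p == (1 if k == p else 0)
                       for k, c in enumerate(coeffs))
    return linear_ok and reduction_ok


def basic_polynomial(p: int):
    """p*X + X**p, the polynomial (4.4.34) for pi = p."""
    return [0, p] + [0] * (p - 2) + [1]


def cyclotomic_polynomial_shifted(p: int):
    """(X + 1)**p - 1, whose roots are zeta - 1 for p-th roots of unity zeta."""
    coeffs = [comb(p, k) for k in range(p + 1)]
    coeffs[0] -= 1
    return coeffs
```

Both polynomials pass the check, which is consistent with the fact that the two constructions of the totally ramified part over $\mathbb{Q}_p$ described above lead to the same field.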


4.4.11 Abelian Extensions of Number Fields 


For the field of rational numbers $\mathbb{Q}$ the theorem of Kronecker-Weber
(see §4.1.2) gives an explicit description of all Abelian extensions with the
help of the action of the Galois group on roots of unity, which may be regarded
as certain special values of the exponential function:
$\zeta_m = \exp(2\pi i/m)$. An analogous theory exists also over an imaginary
quadratic field $K = \mathbb{Q}(\sqrt{d})$, whose Abelian extensions are
constructed with the help of the action of the Galois group $G(\bar{K}/K)$ on
the points of finite order of an elliptic curve with complex multiplication
(more precisely, on the coordinates of these points, see §5.4 of Chapter 5).
This description is essentially the content of the theory of complex
multiplication. In more classical terms, Abelian extensions of an imaginary
quadratic field are described by means of the special values of elliptic
functions and the $j$-invariants corresponding to lattices with complex
multiplication. The Galois action on these values is explicitly described in
terms of the arithmetic of the imaginary quadratic ground field (this was
Kronecker's "Jugendtraum" ("dream of youth"), cf. the book by S. Vladut,
[Vla91]). The content of Hilbert's famous twelfth problem is to give an
explicit description of all Abelian extensions of an arbitrary number field
$K$, $[K : \mathbb{Q}] < \infty$, using special values of certain special
functions (such as the exponential function or elliptic functions), and by
means of the Galois actions on these values.

Some progress has been made in solving this problem for the so-called
CM-fields $K$. These are totally imaginary quadratic extensions
$K = F(\sqrt{-a})$ of totally real fields $F$: $F$ is a number field generated
by a root of a polynomial which splits as a product of linear factors over
$\mathbb{R}$, and $a \in F$ is totally positive (positive in each real
embedding of $F$). This multi-dimensional complex multiplication theory is
based on the study of Abelian varieties with complex multiplication by elements
of $K$. For a real quadratic field $K$ a description of certain Abelian
extensions of $K$ is given by Shimura's theory of "real multiplication".
However, in these cases the situation is less satisfactory than for
$\mathbb{Q}$ or for an imaginary quadratic field $K$, since these constructions
do not give all Abelian extensions of the ground field $K$. A completely
different situation takes place in the function field case, when $K$ is a
finite, separable extension of $\mathbb{F}_q(T)$. Here there is a complete
description of all Abelian extensions of $K$ in terms of the elliptic modules
of V. G. Drinfel'd (and in terms of elliptic functions in positive
characteristic attached to these modules, [Dr]). This result gives an
illustrative example of the analogy between numbers and functions.

The idea of describing extensions of $K$ via the action of the Galois group
$G(\bar{K}/K)$ on certain groups and other algebraic objects has turned out to
be very fruitful. Many examples of constructions of Abelian and non-Abelian
extensions of a ground field $K$ are based on this idea. A complete
classification of all these extensions in terms of Galois representations and
in terms of certain objects of analysis and algebraic geometry (automorphic
forms and motives) is an important aim of Langlands' far-reaching program, see
§6.5.


In the book [Yos03] the main objects are special values given by the
exponential of the derivative at $s = 0$ of the partial zeta function of a
certain ideal class $c$ attached to a number field $F$. Such a special value is
an important invariant which conjecturally gives a unit of an Abelian extension
of a number field and should give an answer to Hilbert's twelfth problem.


Let $F$ be a totally real field, i.e. $F \otimes \mathbb{R} \cong \mathbb{R}^n$
as an $\mathbb{R}$-algebra. According to Shintani, the special values at
non-positive integers of the partial zeta function

$$\zeta_F(\mathfrak{a}, \mathfrak{f}, s) = \sum_{\substack{I \in \mathfrak{a} \\ I \subset O_F}} N(I)^{-s}$$

are rational numbers which can be expressed in terms of certain generating
functions which generalize the generating function of Bernoulli numbers, see
[Shin76].
These generating functions are associated to "cone decompositions" in
$F \otimes \mathbb{R}$ which are defined non-canonically (see also [Hi93],
Chapter I).

Conjectural absolute periods are described in [Yos03] in terms of a function
constructed geometrically from cone decompositions, whose principal term is
given by periods of Abelian varieties of CM-type for an arbitrary CM-field,
making explicit a conjecture of Colmez, [Colm93].

Our knowledge of the nature of CM-periods is quite limited; the only
essential fact in the case $F = \mathbb{Q}$ is the classical Chowla-Selberg
formula,

$$\pi\, p_K(\mathrm{id}, \mathrm{id})^2 \sim \prod_{a=1}^{d-1} \Gamma\!\left(\frac{a}{d}\right)^{w\chi(a)/2h} = d \cdot \exp\!\left(\frac{L'(0,\chi)}{L(0,\chi)}\right),$$

where $K$ is an imaginary quadratic field of discriminant $-d$, $w$ is the
number of roots of unity in $K$, $h$ the class number of $K$, and $\chi$ the
Dirichlet character corresponding to $K$ [ChSe68]. Also, using the Jacobian of
the Fermat curve, G. W. Anderson found that the CM-periods in the case of a
cyclotomic field are linked with the logarithmic derivative of Dirichlet's
$L$-functions at $s = 0$:

$$p_K(\mathrm{id}, \sigma) \sim \pi^{-\mu(\sigma)/2} \prod_{\eta \in \hat{G}_-} \exp\!\left(\frac{\eta(\sigma)}{[K : \mathbb{Q}]} \, \frac{L'(0,\eta)}{L(0,\eta)}\right),$$

where $\eta$ is a Dirichlet character, $\mu(\sigma) = \pm 1$ or $0$ (see
[Ande82]).


The conjectures of Harold Stark were made in the 1970's and 80's (see
[St71-80]) and concern the values at $s = 1$ and $s = 0$ of complex Artin
$L$-series attached to Galois extensions of number fields $K/F$. A systematic
approach to the Stark conjectures was presented in the book by Tate, cf.
[Ta84]. In the most general terms these conjectures concern the special values
of Artin $L$-functions of number fields and their analogues, relating them to
certain "regulators of $S$-units" and analogous objects.

More precisely, Conjecture S (on units), discussed in Chapter II of
[Yos03], is formulated in terms of the partial zeta function

$$\zeta_F(s, \sigma) = \sum_{\mathfrak{A} \subset O_F,\ \left(\frac{K/F}{\mathfrak{A}}\right) = \sigma} N(\mathfrak{A})^{-s}$$

attached to an element $\sigma$ of the Galois group $\mathrm{Gal}(K/F)$ of an
Abelian extension $K$ of a totally real ground field $F$ ramified over only one
infinite place of $F$ (here $\left(\frac{K/F}{\mathfrak{A}}\right)$ denotes the
Artin symbol). Conjecture S says that there exists a unit $\varepsilon$ in $K$
such that

Cr(0,0) =e? 
for every $\sigma \in \mathrm{Gal}(K/F)$. A method is given, which derives
Conjecture S on units from the following Conjecture on Galois action:

$$\left(\frac{R(\chi)}{c(\chi)}\right)^{\sigma} = \frac{R(\chi^{\sigma})}{c(\chi^{\sigma})}.$$

Here $\chi \in \hat{G}$ denotes a (non-Abelian) irreducible character of a
Galois extension $K$ of $F$ with the finite Galois group
$G = \mathrm{Gal}(K/F)$, $c(\chi)$ the leading term of the Taylor expansion of
its Artin $L$-function at $s = 0$:

$$L(s, \chi, K/F) = c(\chi) s^{r(\chi)} + O(s^{r(\chi)+1}), \qquad 0 \le r(\chi) \in \mathbb{Z},$$

and $R(\chi)$ denotes a generalized regulator which is the determinant of a
matrix of size $r(\chi)$ whose entries are linear combinations of the
logarithms of the absolute values of units of $K$ with algebraic coefficients.


In recent years there has been an important development in the study of "class
invariants" of ray classes in a totally real field $F$, and in their
arithmetical interpretations (which include generalizations of Stickelberger's
theorem, generalizations of Dedekind sums as certain cocycles, etc.). A recent
interpretation of Stark's conjectures using notions of noncommutative geometry
was given in [Man02], [Man02a].


4.5 Galois Group in Arithmetical Problems


4.5.1 Dividing a circle into n equal parts 


The problem of dividing a circle into n equal parts (cf. [Gaul], [Gin85]) has a 
geometric form. However its solution, given by Gauss, was based essentially 
on arithmetical and algebraic considerations. The construction of the regular 
17-gon was the first mathematical invention of Gauss, written in his diary on 
March 30th 1796, one month before his 19th birthday. Previously one could 
only construct triangles, squares, pentagons, 15-gons, and all those n-gons 
which are obtained from these by doubling the number of sides. From the 
algebraic point of view, the construction of a regular $n$-gon is equivalent to
constructing the roots of unity of degree $n$ on the complex plane, i.e. the
solutions to the equation

$$X^n - 1 = 0, \qquad (4.5.1)$$

which have the form

$$\varepsilon_k = \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n} = \exp\!\left(\frac{2\pi i k}{n}\right), \qquad k = 0, 1, \ldots, n-1. \qquad (4.5.2)$$


Assuming that the segment of length one is given, we can construct using
ruler-and-compass methods all new segments whose length is obtained from
the lengths of given segments using the operations of addition, subtraction,
multiplication, division and extraction of the square root. Through a sequence
of these operations one may construct any number belonging to any field $L$,
which is a union of a tower of quadratic extensions

$$L = L_m \supset L_{m-1} \supset \cdots \supset L_1 \supset L_0 = \mathbb{Q}, \qquad (4.5.3)$$

where $L_{i+1} = L_i(\sqrt{d_i})$, $d_i \in L_i$. It is not difficult to prove
that no other points of the complex plane can be constructed starting from the
point $z = 1$ and using only ruler-and-compass methods. In order to construct
$z = \alpha$ (if this is possible) one constructs the corresponding tower of
type (4.5.3) for the field $L$, generated by all the roots of the minimal
polynomial $f(X) \in \mathbb{Q}[X]$ of $\alpha$ (the decomposition field of
$f(X)$). By Galois theory, to a quadratic extension $L_1/\mathbb{Q}$
corresponds a subgroup $G_1 = G(L/L_1)$ of index two in the Galois group
$G_0 = G(L/\mathbb{Q})$ (the group of Galois symmetries of the polynomial
$f(X)$). The action of the subgroup $G_1$ partitions the set of all roots of
$f(X)$ into two parts, such that the sum of all elements of each part belongs
to $L_1$ and generates this field, being invariant under automorphisms in
$G_1$. In the next step each of these two parts is divided into two further
parts using the action on the roots by elements of $G_2 = G(L/L_2)$, which is
of index 2 in $G_1$, etc. This process continues until we obtain the subset of
roots consisting of only one element $z = \alpha$.
For example, for the root of unity $\alpha = \varepsilon_1$ from (4.5.2) the
corresponding irreducible polynomial $f(X)$ is the cyclotomic polynomial
$\Phi_n(X)$, whose roots $\varepsilon_k$ ($(k,n) = 1$) are primitive roots of
unity; the Galois symmetries have the form

$$\sigma_a : \varepsilon_k \mapsto \varepsilon_k^{a} = \varepsilon_{ka \bmod n} \qquad (a \in (\mathbb{Z}/n\mathbb{Z})^\times).$$


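For $n = 5$ one has $G_0 = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$ and the
subgroup $G_1 = \{\sigma_1, \sigma_4\}$ partitions the set of primitive roots
into the parts $\{\varepsilon_1, \varepsilon_4\}$ and
$\{\varepsilon_2, \varepsilon_3\}$. One has
$\Phi_5(x) = x^4 + x^3 + x^2 + x + 1$. Hence

$$\varepsilon_1^2 + \varepsilon_1 + 1 + \varepsilon_1^{-1} + \varepsilon_1^{-2} = 0.$$

By putting $u = \varepsilon_1 + \varepsilon_1^{-1} = \varepsilon_1 + \varepsilon_4$
we obtain the equation

$$u^2 + u - 1 = 0, \qquad \varepsilon_1 + \varepsilon_4 = \frac{-1 + \sqrt{5}}{2}, \qquad \varepsilon_2 + \varepsilon_3 = \frac{-1 - \sqrt{5}}{2},$$

which gives the desired construction of the regular pentagon.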
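The pentagon computation is easy to confirm numerically. The following short sketch (illustrative names, floating-point arithmetic only) evaluates the two sums of roots of unity and compares them with the roots of $u^2 + u - 1 = 0$.

```python
import cmath
import math


def root_of_unity(k: int, n: int) -> complex:
    """epsilon_k = exp(2*pi*i*k/n), as in (4.5.2)."""
    return cmath.exp(2j * math.pi * k / n)


def pentagon_sums():
    """Return (eps_1 + eps_4, eps_2 + eps_3) for n = 5 as real numbers."""
    u_plus = root_of_unity(1, 5) + root_of_unity(4, 5)
    u_minus = root_of_unity(2, 5) + root_of_unity(3, 5)
    # Each pair consists of complex conjugates, so the sums are real.
    return u_plus.real, u_minus.real
```

Since $\varepsilon_1 + \varepsilon_4 = 2\cos(2\pi/5)$, the first value is the length needed to mark off a vertex of the regular pentagon on the unit circle.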


Fig. 4.3. 


In the case $n = 17$ Gauss' intuition led him to the correct partition of
the roots of $\Phi_{17}(x) = x^{16} + x^{15} + \cdots + x + 1$ given by Galois
symmetries (Galois theory had not yet been discovered!). The group of
symmetries $G_0 \cong (\mathbb{Z}/17\mathbb{Z})^\times$ is a cyclic group of
order 16 with a generator $3 \bmod 17$ (a primitive root), and Gauss's idea was
to use a more convenient indexing system for the roots (see Fig. 4.3). Let us
assign to the root $\varepsilon_k$ the new number $l$ (the notation
$\varepsilon_{[l]}$) defined by the condition $k \equiv 3^l \bmod 17$,
$l = 0, 1, \ldots, 15$, and let $T_l$ denote the automorphism $\sigma_{3^l}$.
Then

$$T_l \varepsilon_{[m]} = \varepsilon_{[m+l]} \qquad (m, l \bmod 16). \qquad (4.5.4)$$

The corresponding subgroups have the form

$$G_0 = \{T_0, T_1, \ldots, T_{15}\}, \qquad G_1 = \{T_0, T_2, \ldots, T_{14}\},$$

$$G_2 = \{T_0, T_4, T_8, T_{12}\}, \qquad G_3 = \{T_0, T_8\}.$$


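We now show how the idea described above works in this case. First of all
note that

$$\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{16} = \varepsilon_{[0]} + \varepsilon_{[1]} + \cdots + \varepsilon_{[15]} = -1 \qquad (4.5.5)$$

(the sum of a geometric progression). Denote by $\sigma_{m,r}$ the sum of
$\varepsilon_{[l]}$ with $l$ congruent to $r$ modulo $m$. We thus obtain

$$\sigma_{2,0} = \varepsilon_{[0]} + \varepsilon_{[2]} + \cdots + \varepsilon_{[14]} = \sum_{T_l \in G_1} T_l \varepsilon_{[0]},$$

$$\sigma_{2,1} = \varepsilon_{[1]} + \varepsilon_{[3]} + \cdots + \varepsilon_{[15]} = \sum_{T_l \in G_1} T_l \varepsilon_{[1]}.$$

Identity (4.5.5) implies

$$\sigma_{2,0} + \sigma_{2,1} = -1,$$

and by termwise multiplication we find that

$$\sigma_{2,0} \cdot \sigma_{2,1} = 4(\varepsilon_{[0]} + \varepsilon_{[1]} + \cdots + \varepsilon_{[15]}) = -4.$$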


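Now using Viète's formulae, we may express $\sigma_{2,0}$ and $\sigma_{2,1}$ as
the roots of the quadratic equation $x^2 + x - 4 = 0$:

$$\sigma_{2,0} = \frac{\sqrt{17} - 1}{2}, \qquad \sigma_{2,1} = \frac{-\sqrt{17} - 1}{2},$$

which generate the field $L_1 = \mathbb{Q}(\sqrt{17})$. We distinguish the two
roots by the condition that $\sigma_{2,0} > \sigma_{2,1}$; in each of these
sums the roots arise together with their conjugates. In the first case we have
to add and to multiply the real parts of the numbers $\varepsilon_1$,
$\varepsilon_2$, $\varepsilon_4$, $\varepsilon_8$ and in the second case we do
the same for $\varepsilon_3$, $\varepsilon_5$, $\varepsilon_6$,
$\varepsilon_7$. In a similar way we have that
$\sigma_{4,0} + \sigma_{4,2} = \sigma_{2,0}$,
$\sigma_{4,1} + \sigma_{4,3} = \sigma_{2,1}$, and the multiplication using
(4.5.4) shows that
$\sigma_{4,0} \cdot \sigma_{4,2} = \sigma_{2,0} + \sigma_{2,1} = -1$.
Hence $\sigma_{4,0}$ and $\sigma_{4,2}$ are roots of the equation
$x^2 - \sigma_{2,0}\, x - 1 = 0$ which generates the field $L_2$:

$$\sigma_{4,0} = \frac{1}{4}\left(\sqrt{17} - 1 + \sqrt{34 - 2\sqrt{17}}\right), \qquad \sigma_{4,2} = \frac{1}{4}\left(\sqrt{17} - 1 - \sqrt{34 - 2\sqrt{17}}\right).$$

In the same way we see that

$$\sigma_{4,1} = \frac{1}{4}\left(-\sqrt{17} - 1 + \sqrt{34 + 2\sqrt{17}}\right), \qquad \sigma_{4,3} = \frac{1}{4}\left(-\sqrt{17} - 1 - \sqrt{34 + 2\sqrt{17}}\right).$$

An analogous argument shows that

$$\sigma_{8,0} = \varepsilon_{[0]} + \varepsilon_{[8]} = 2\cos\frac{2\pi}{17} = \frac{1}{2}\left(\sigma_{4,0} + \sqrt{\sigma_{4,0}^2 - 4\sigma_{4,1}}\right)$$

$$= \frac{1}{8}\left(\sqrt{17} - 1 + \sqrt{34 - 2\sqrt{17}}\right) + \frac{1}{4}\sqrt{17 + 3\sqrt{17} - \sqrt{170 + 38\sqrt{17}}},$$

which completes the construction.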
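The nested radicals for the 17-gon can be checked against a direct numerical evaluation of the sums $\sigma_{m,r}$. The sketch below uses floating-point arithmetic and the indexing $\varepsilon_{[l]} = \varepsilon_{3^l \bmod 17}$ described above; the function and variable names are illustrative.

```python
import cmath
import math

N, GENERATOR = 17, 3  # 3 is a primitive root modulo 17


def eps_bracket(l: int) -> complex:
    """epsilon_[l] = exp(2*pi*i * 3**l / 17)."""
    return cmath.exp(2j * math.pi * pow(GENERATOR, l, N) / N)


def period(m: int, r: int) -> float:
    """sigma_{m,r}: sum of epsilon_[l] over 0 <= l < 16 with l = r (mod m).

    For m in {2, 4, 8} the index set is stable under l -> l + 8, i.e. under
    complex conjugation, so the sum is real.
    """
    return sum(eps_bracket(l) for l in range(r % m, 16, m)).real


S17 = math.sqrt(17)
CLOSED_FORMS = {
    (2, 0): (S17 - 1) / 2,
    (2, 1): (-S17 - 1) / 2,
    (4, 0): (S17 - 1 + math.sqrt(34 - 2 * S17)) / 4,
    (4, 2): (S17 - 1 - math.sqrt(34 - 2 * S17)) / 4,
    (4, 1): (-S17 - 1 + math.sqrt(34 + 2 * S17)) / 4,
    (4, 3): (-S17 - 1 - math.sqrt(34 + 2 * S17)) / 4,
    (8, 0): (S17 - 1 + math.sqrt(34 - 2 * S17)) / 8
            + math.sqrt(17 + 3 * S17 - math.sqrt(170 + 38 * S17)) / 4,
}
```

Comparing `period(m, r)` with `CLOSED_FORMS[(m, r)]` for each key reproduces the tower $\mathbb{Q} \subset L_1 \subset L_2 \subset \cdots$ step by step; the last entry equals $2\cos(2\pi/17)$.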
In the general case of an $n$-gon with
$n = 2^r p_1^{r_1} \cdots p_s^{r_s}$, where the $p_i$ are distinct odd primes,
we have that

$$G(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \cong (\mathbb{Z}/n\mathbb{Z})^\times.$$

By considering the tower (4.5.3) of quadratic extensions, we see that the
possibility of constructing the regular $n$-gon is equivalent to the condition
that the number

$$|(\mathbb{Z}/n\mathbb{Z})^\times| = \varphi(n) = 2^{r-1} (p_1 - 1) p_1^{r_1 - 1} \cdots (p_s - 1) p_s^{r_s - 1}$$

is a power of 2. This holds precisely when $n = 2^r p_1 \cdots p_s$, where the
$p_i$ are distinct primes such that $p_i = 2^{m_i} + 1$. Since
$2^{m_i} \equiv -1 \pmod{p_i}$, the class of 2 has order $2m_i$ in the cyclic
group $(\mathbb{Z}/p_i\mathbb{Z})^\times$, and it follows from Lagrange's
theorem that $2m_i$ divides $p_i - 1 = 2^{m_i}$. Hence $m_i$ is also a power
of 2. The construction is therefore only possible for $n = 2^r p_1 \cdots p_s$
where the $p_i$ are Fermat primes $p_i = 2^{2^{t_i}} + 1$, which were discussed
in Part I, §1.1.2. The proof of the latter statement was not published by
Gauss: "Although the framework of our treatise does not allow us to proceed
with this proof, we think that it is necessary to point out this fact, in order
to prevent somebody else from wasting his time, by attempting to find some
other cases, which are not given by our theory."
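The criterion can be tested by brute force: for small $n$ one compares the condition "$\varphi(n)$ is a power of 2" with the shape $n = 2^r p_1 \cdots p_s$ for distinct Fermat primes. The sketch below uses only the five known Fermat primes, which is an assumption that is harmless in the tested range; all names are illustrative.

```python
KNOWN_FERMAT_PRIMES = (3, 5, 17, 257, 65537)


def euler_phi(n: int) -> int:
    """Euler's function phi(n) by trial division."""
    result, m, d = n, n, 2
    while d * d <= m:
        if m % d == 0:
            while m % d == 0:
                m //= d
            result -= result // d
        d += 1
    if m > 1:
        result -= result // m
    return result


def is_power_of_two(k: int) -> bool:
    return k > 0 and k & (k - 1) == 0


def has_gauss_form(n: int) -> bool:
    """True if n = 2**r * (product of distinct known Fermat primes)."""
    while n % 2 == 0:
        n //= 2
    for p in KNOWN_FERMAT_PRIMES:
        if n % p == 0:
            n //= p  # each Fermat prime may occur only to the first power
    return n == 1


def constructible_polygons(limit: int):
    """All n with 3 <= n <= limit for which phi(n) is a power of 2."""
    return [n for n in range(3, limit + 1) if is_power_of_two(euler_phi(n))]
```

For instance, `constructible_polygons(20)` lists the values of $n$ up to 20 for which a ruler-and-compass construction exists; 7, 9, 11, 13, 14, 18 and 19 are absent.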


4.5.2 Kummer Extensions and the Power Residue Symbol 


(see [CF67], [Chev40], [Koch70]). Let $K$ be a field containing a primitive
root of unity $\zeta$ of degree $m$, where $m$ is a fixed positive integer not
divisible by the characteristic of $K$. One may show that cyclic extensions
$L/K$ of degree dividing $m$ coincide with the so-called Kummer extensions of
type $K(\sqrt[m]{a})/K$ ($a \in K$). In applications $K$ will be either a
number field or a completion of one. Any extension $L/K$ containing a root
$\alpha$ of $X^m = a$ also contains all the other roots $\zeta\alpha$, ...,
$\zeta^{m-1}\alpha$ of this polynomial. Let $\sigma$ be an element of the
Galois group $G(K(\sqrt[m]{a})/K)$. If we fix a root $\alpha$ then the
automorphism $\sigma$ is completely determined by the image of $\alpha$ under
the action of $\sigma$: $\alpha^\sigma = \zeta^b \alpha$. In particular, if $a$
is an element of order $m$ in the multiplicative group $K^\times/K^{\times m}$
then $X^m - a$ is irreducible and $a^r$ is an $m$-th power iff $m \mid r$. In
this case the assignment $\sigma \mapsto b \bmod m$ provides an isomorphism of
the Galois group $G(K(\sqrt[m]{a})/K)$ with the cyclic group
$\mathbb{Z}/m\mathbb{Z}$.

Now let $L$ be an arbitrary cyclic extension of degree $m$ of $K$. We shall
construct explicitly an element $a \in K$ such that $L = K(\sqrt[m]{a})$. Let
$\sigma$ be a generator of the cyclic group $G(L/K)$ and let $L = K(\gamma)$
for some primitive element $\gamma \in L$, chosen so that the elements
$\gamma, \gamma^\sigma, \ldots, \gamma^{\sigma^{m-1}}$ form a basis of $L$ over
$K$. Consider the sum

$$\beta = \sum_{s=0}^{m-1} \zeta^{s} \gamma^{\sigma^{s}}. \qquad (4.5.6)$$

Then $\beta^\sigma = \zeta^{-1}\beta$, and $\beta \ne 0$ since the elements
$\gamma, \gamma^\sigma, \ldots, \gamma^{\sigma^{m-1}}$ are linearly independent
over $K$. Thus $\beta^m \in K$ and $\beta^r \notin K$ for $0 < r < m$, i.e.
$a = \beta^m$ is an element of order $m$ in the quotient group
$K^\times/K^{\times m}$ and the above argument shows that the field $K(\beta)$
is a cyclic extension of degree $m$ contained in $L$ and is therefore equal to
$L = K(\sqrt[m]{a})$. In a similar way we can check that two extensions
$K(\sqrt[m]{a})/K$ and $K(\sqrt[m]{b})/K$ coincide iff $a = b^r c^m$ for some
$c \in K$ and $r \in \mathbb{Z}$ such that $(r, m) = 1$. These statements can
be unified into one statement by saying that for a given field
$K \supset \mu_m$ and its Galois group $G_K = G(\bar{K}/K)$ there is the
isomorphism

$$K^\times/K^{\times m} \cong \mathrm{Hom}(G_K, \mu_m), \qquad (4.5.7)$$

where $\mu_m = \{\zeta \in K \mid \zeta^m = 1\}$ and
$\mathrm{Char}\, K \nmid m$. In order to construct (4.5.7) for a given
$a \in K^\times$ choose $\gamma \in \bar{K}^\times$ with the condition
$\gamma^m = a$; for $\sigma \in G_K$ the formula
$\varphi_a(\sigma) = \gamma^\sigma/\gamma$ then defines a homomorphism
$\varphi_a : G_K \to \mu_m$. The fact that the map $a \mapsto \varphi_a$
defines an isomorphism (4.5.7) is deduced from Hilbert's Theorem 90 on the
cohomology of the multiplicative group: $H^1(G_K, \bar{K}^\times) = \{1\}$
(see §4.5.3).

Now let $K$ be a number field, $\mu_m \subset K$,
$\mathfrak{p} = \mathfrak{p}_v$ a prime divisor attached to a non-Archimedean
place $v$ of $K$. The decomposition of $\mathfrak{p}$ in the extension
$K(\sqrt[m]{a})/K$ is reduced to the study of the extension
$K_v(\sqrt[m]{a})/K_v$ of the local field $K_v$ (by the construction of
extensions of absolute values, see §4.3.6). One can assume that $a$ belongs to
the ring $O_K$ of integers of $K$, and that $\mathfrak{p} \nmid ma$. Then the
decomposition of the maximal ideal $\mathfrak{p} \subset O_K$ is determined by
the decomposition of the polynomial $X^m - a \pmod{\mathfrak{p}}$ over the
field $O_K/\mathfrak{p}$ (by the lemma in §4.2.3). This decomposition is a
product of pairwise coprime irreducible factors of degree $f$, where $f$ is the
degree of the residue field extension: the least positive integer $f$ such that
the congruence $a^f \equiv x^m \pmod{\mathfrak{p}}$ is solvable in
$O_K/\mathfrak{p}$. Under our assumptions the ideal $\mathfrak{p}$ is
unramified in $L = K(\sqrt[m]{a})$ and
$\mathfrak{p} = \mathfrak{P}_1 \cdots \mathfrak{P}_r$ ($f \cdot r = m$). In
particular, $\mathfrak{p}$ splits completely iff $f = 1$, i.e. iff the
congruence $x^m \equiv a \bmod \mathfrak{p}$ is solvable.

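We now define the power residue symbol. In order to do this denote by $S$
the set of places of $K$ which either divide $m$ or are Archimedean. For
elements $a_1, \ldots, a_l \in K^\times$ denote by $S(a_1, \ldots, a_l)$ the
union of $S$ and the set of places $v$ for which $|a_i|_v \ne 1$ for some $i$.
For $a \in K^\times$ and a place $v \notin S(a)$ define the power residue
symbol $\left(\frac{a}{v}\right) \in \mu_m$ by

$$\left(\sqrt[m]{a}\right)^{F_{L/K}(v)} = \left(\frac{a}{v}\right) \sqrt[m]{a}, \qquad (4.5.8)$$

where $L = K(\sqrt[m]{a})$ and $F_{L/K}(v) \in G(L/K)$ is the global Artin
symbol, cf. §4.4.6. The number $\left(\frac{a}{v}\right) \in \mu_m$ does not
depend on the choice of $\sqrt[m]{a}$, and one verifies that

$$\left(\frac{aa'}{v}\right) = \left(\frac{a}{v}\right) \left(\frac{a'}{v}\right) \qquad (v \notin S(a, a')). \qquad (4.5.9)$$

According to the definition of $F_{L/K}(v)$ as a Frobenius element, the
identity (4.5.8) is equivalent to the congruence
$\left(\sqrt[m]{a}\right)^{Nv - 1} \equiv \left(\frac{a}{v}\right) \pmod{\mathfrak{p}_v}$,
which implies that $m \mid (Nv - 1)$ and

$$a^{(Nv - 1)/m} \equiv \left(\frac{a}{v}\right) \pmod{\mathfrak{p}_v}, \qquad (4.5.10)$$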
Uv 


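(the generalized Euler criterion) since the group $(O_v/\mathfrak{p}_v)^\times$
is cyclic of order $Nv - 1$. For an arbitrary divisor
$\mathfrak{b} = \prod_{v \notin S(a)} \mathfrak{p}_v^{n(v)} \in I^{S(a)}$ put

$$\left(\frac{a}{\mathfrak{b}}\right) = \prod_{v \notin S(a)} \left(\frac{a}{v}\right)^{n(v)}.$$

Then we have that

$$\left(\sqrt[m]{a}\right)^{F_{L/K}(\mathfrak{b})} = \left(\frac{a}{\mathfrak{b}}\right) \sqrt[m]{a}, \qquad (4.5.11)$$

where $L = K(\sqrt[m]{a})$, $F_{L/K}(\mathfrak{b}) \in G(L/K)$ is the global
Artin symbol, and the following equation holds:

$$\left(\frac{a}{\mathfrak{b}\mathfrak{b}'}\right) = \left(\frac{a}{\mathfrak{b}}\right) \left(\frac{a}{\mathfrak{b}'}\right) \qquad (\mathfrak{b}, \mathfrak{b}' \in I^{S(a)}). \qquad (4.5.12)$$

For any prime divisor $v \notin S(a)$ the following statements are equivalent:

1) $\left(\frac{a}{v}\right) = 1$;
2) the congruence $x^m \equiv a \pmod{\mathfrak{p}_v}$ is solvable for some $x \in O_v$;
3) the equation $x^m = a$ is solvable for some $x \in K_v$.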




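A solution in 2) can be lifted to a solution in 3) by Hensel's lemma, see
§4.3.2. For an integral ideal $\mathfrak{b} \subset O_K$ the value of
$\left(\frac{a}{\mathfrak{b}}\right)$ depends only on $a \bmod \mathfrak{b}$
as long as $a \in O_K$. Thus the following character of order $m$ is defined

$$\chi_{\mathfrak{b}} : (O_K/\mathfrak{b})^\times \to \mu_m, \qquad \chi_{\mathfrak{b}}(a) = \left(\frac{a}{\mathfrak{b}}\right). \qquad (4.5.13)$$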
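The generalized Euler criterion (4.5.10) and the equivalence of statements 1) and 2) can be illustrated in the simplest situation, which is an assumption made here for the sake of a runnable example: $K = \mathbb{Q}(\zeta_m)$, $v$ a place above a rational prime $p \equiv 1 \pmod{m}$ (so that $O_v/\mathfrak{p}_v \cong \mathbb{F}_p$ and $Nv = p$), and $a$ a rational integer prime to $p$. The function names are illustrative.

```python
def power_residue_value(a: int, m: int, p: int) -> int:
    """a**((p-1)/m) mod p: the image in F_p of the m-th power residue symbol.

    Assumes p is a prime with p = 1 (mod m) and p does not divide a.
    The result is an m-th root of unity in F_p.
    """
    if (p - 1) % m != 0 or a % p == 0:
        raise ValueError("need p = 1 (mod m) and p not dividing a")
    return pow(a, (p - 1) // m, p)


def is_mth_power_residue(a: int, m: int, p: int) -> bool:
    """Brute-force test of statement 2): is x**m = a (mod p) solvable?"""
    return any(pow(x, m, p) == a % p for x in range(1, p))
```

Because $\mathbb{F}_p^\times$ is cyclic of order $p - 1$, the value returned by `power_residue_value` equals 1 exactly when `is_mth_power_residue` returns `True`.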


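The cubic reciprocity law. Let $K = \mathbb{Q}(\zeta_3) = \mathbb{Q}(\sqrt{-3})$,
$m = 3$. Then $O_K = \mathbb{Z}[\zeta_3]$ is a principal ideal domain and if
$\mathfrak{p} = \mathfrak{p}_v = (\pi)$ for a prime element $\pi$, then we
shall use the notation $\left(\frac{a}{\pi}\right)$ instead of
$\left(\frac{a}{v}\right)$. Call a prime element primary if
$\pi \equiv 2 \bmod 3$, i.e. either $\pi = q$ is a rational prime number,
$q \equiv 2 \pmod{3}$, or $N\pi = p \equiv 1 \pmod{3}$,
$\pi \equiv 2 \bmod 3$. One easily verifies that among the generators of an
ideal $\mathfrak{p}$, $\mathfrak{p} \nmid 3$, there is exactly one primary
element. Let $\mathfrak{p}_1 = (\pi_1)$ and $\mathfrak{p}_2 = (\pi_2)$, where
$\pi_1$ and $\pi_2$ are coprime primary elements such that
$N\mathfrak{p}_1 \ne N\mathfrak{p}_2 \ne 3$. Then the following "reciprocity
law" holds:

$$\left(\frac{\pi_1}{\pi_2}\right) = \left(\frac{\pi_2}{\pi_1}\right). \qquad (4.5.14)$$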
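The cubic law (4.5.14) can be tested on small primary primes. In the sketch below an element $a + b\omega$ of $\mathbb{Z}[\omega]$, $\omega = \zeta_3$, is stored as the pair `(a, b)`, and the cubic residue symbol is computed from the generalized Euler criterion $a^{(N\pi - 1)/3} \equiv \left(\frac{a}{\pi}\right) \pmod{\pi}$. The representation and all function names are choices made for this illustration.

```python
OMEGA = (0, 1)  # the pair (a, b) stands for a + b*omega, omega**2 + omega + 1 = 0


def mul(x, y):
    """Product in Z[omega], using omega**2 = -1 - omega."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)


def norm(x):
    a, b = x
    return a * a - a * b + b * b


def conj(x):
    """Complex conjugate: a + b*omega -> (a - b) - b*omega."""
    a, b = x
    return (a - b, -b)


def mod(x, pi):
    """A representative of x modulo pi with norm smaller than N(pi)."""
    n = norm(pi)
    u, v = mul(x, conj(pi))                 # x / pi = (u + v*omega) / n
    q = ((2 * u + n) // (2 * n), (2 * v + n) // (2 * n))  # nearest integers
    qa, qb = mul(q, pi)
    return (x[0] - qa, x[1] - qb)


def power_mod(x, e, pi):
    """x**e modulo pi by repeated squaring."""
    result, base = (1, 0), mod(x, pi)
    while e:
        if e & 1:
            result = mod(mul(result, base), pi)
        base = mod(mul(base, base), pi)
        e >>= 1
    return result


def cubic_symbol(a, pi):
    """Exponent k in {0, 1, 2} such that (a/pi) = omega**k."""
    t = power_mod(a, (norm(pi) - 1) // 3, pi)
    root = (1, 0)
    for k in range(3):
        if mod((t[0] - root[0], t[1] - root[1]), pi) == (0, 0):
            return k
        root = mul(root, OMEGA)
    raise ValueError("a is not coprime to pi")


# Primary primes (a = 2 mod 3, b = 0 mod 3) with pairwise distinct norms
# 4, 25, 121, 7, 13, 19.
PRIMARY_PRIMES = [(2, 0), (5, 0), (11, 0), (2, 3), (-1, 3), (5, 3)]
```

For example, `cubic_symbol((2, 0), (2, 3))` and `cubic_symbol((2, 3), (2, 0))` both return 1, i.e. both symbols are equal to $\omega$, as (4.5.14) predicts for $\pi_1 = 2$ and $\pi_2 = 2 + 3\omega$.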


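The biquadratic reciprocity law. Let $m = 4$, $K = \mathbb{Q}(i)$ and
$O_K = \mathbb{Z}[i]$, the ring of Gaussian integers. We shall call
$a \in O_K$ primary if $a \equiv 1 \pmod{(1+i)^3}$. Then one verifies that in
any prime ideal $\mathfrak{p}$, $\mathfrak{p} \nmid 2$, one can choose a unique
primary generator. If $\mathfrak{p}_1 = (\pi_1)$ and
$\mathfrak{p}_2 = (\pi_2)$, where $\pi_1$ and $\pi_2$ are coprime primary
elements, then the following reciprocity law holds:

$$\left(\frac{\pi_1}{\pi_2}\right) = \left(\frac{\pi_2}{\pi_1}\right) (-1)^{\frac{N\pi_1 - 1}{4} \cdot \frac{N\pi_2 - 1}{4}}. \qquad (4.5.15)$$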


4.5.3 Galois Cohomology 


Group cohomology theory provides a standard method of obtaining arithmetical
information from Galois groups, acting on various objects: algebraic
numbers, idele classes, points of algebraic varieties and algebraic groups etc.
(cf. [Se58], [Se63], [Se64], [Chev40], [Ire82], [Koch70], [Koly88], [Wei74a]).
Let $G$ be a finite (or profinite) group acting on a $G$-module $A$ (endowed
with the discrete topology). The cohomology groups of $G$ with coefficients in
$A$ are defined with the help of the complex of cochains. Consider the
following Abelian groups:

$$C^0(G, A) = A,$$

and for $n \ge 1$

$$C^n(G, A) = \{f : G \times \cdots \times G \to A \mid f \text{ is continuous}\}$$

(the addition of functions is pointwise and the continuity of
$f \in C^n(G, A)$ means that the function $f(g_1, \ldots, g_n)$ depends only on
the cosets of the $g_i$ modulo some open subgroup of $G$).

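The formula

$$(d_n f)(g_1, \ldots, g_{n+1}) = g_1 f(g_2, \ldots, g_{n+1}) + \sum_{i=1}^{n} (-1)^i f(g_1, \ldots, g_i g_{i+1}, \ldots, g_{n+1}) + (-1)^{n+1} f(g_1, \ldots, g_n) \qquad (4.5.16)$$

defines a homomorphism $d_n : C^n(G, A) \to C^{n+1}(G, A)$, such that
$d_{n+1} \circ d_n = 0$.

The group $Z^n(G, A) = \mathrm{Ker}\, d_n$ is called the group of $n$-cocycles,
and the group $B^n(G, A) = \mathrm{Im}\, d_{n-1}$ is called the group of
$n$-coboundaries. The property $d_n \circ d_{n-1} = 0$ implies that
$B^n(G, A) \subset Z^n(G, A)$. The cohomology groups are then defined by

$$H^n(G, A) = Z^n(G, A)/B^n(G, A) = \begin{cases} \mathrm{Ker}\, d_n/\mathrm{Im}\, d_{n-1} & \text{for } n \ge 1, \\ \mathrm{Ker}\, d_0 & \text{for } n = 0. \end{cases} \qquad (4.5.17)$$

If $n = 0$ then

$$H^0(G, A) = A^G = \{a \in A \mid ga = a \text{ for all } g \in G\}. \qquad (4.5.18)$$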


For $n = 1$ we call a continuous map $f : G \to A$ a skew-homomorphism iff for
all $g_1, g_2 \in G$ one has

$$f(g_1 g_2) = f(g_1) + g_1 f(g_2). \qquad (4.5.19)$$

One says that a skew-homomorphism splits iff for a fixed $a \in A$ it can be
written in the form $f(g) = a - ga$. The group $H^1(G, A)$ can be identified
with the quotient group of the group of all skew-homomorphisms modulo
the subgroup formed by all split skew-homomorphisms. If the action of $G$
on $A$ is trivial then $H^1(G, A)$ coincides with the group of all (continuous)
homomorphisms from $G$ into $A$.
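For a finite group and a finite module this description of $H^1$ can be evaluated by brute force. The sketch below counts functions satisfying (4.5.19) and divides by the number of split ones; it assumes $A = \mathbb{Z}/n\mathbb{Z}$ with $G$ given by an explicit list of elements, a multiplication function and an action function. This interface is a choice made for the illustration.

```python
from itertools import product


def first_cohomology_order(group, mul, a_mod, act):
    """Order of H^1(G, A) for a finite group G acting on A = Z/a_mod.

    group : list of group elements
    mul   : mul(g, h) returns the product g*h (an element of `group`)
    act   : act(g, a) returns g.a as an integer modulo a_mod
    """
    elements = range(a_mod)
    cocycles = set()
    for values in product(elements, repeat=len(group)):
        f = dict(zip(group, values))
        # skew-homomorphism condition (4.5.19): f(gh) = f(g) + g f(h)
        if all(f[mul(g, h)] == (f[g] + act(g, f[h])) % a_mod
               for g in group for h in group):
            cocycles.add(values)
    # split skew-homomorphisms: g -> a - g.a for a fixed a in A
    split = {tuple((a - act(g, a)) % a_mod for g in group) for a in elements}
    assert split <= cocycles
    return len(cocycles) // len(split)


# Example data: G = Z/2 acting on A = Z/4 by negation, and trivially.
C2 = [0, 1]
c2_mul = lambda g, h: (g + h) % 2
negation = lambda g, a: a % 4 if g == 0 else (-a) % 4
trivial = lambda g, a: a % 4
```

With these data `first_cohomology_order(C2, c2_mul, 4, negation)` gives 2, and for the trivial action the result is again 2, which is the number of homomorphisms $\mathbb{Z}/2 \to \mathbb{Z}/4$, in agreement with the last remark above.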

For $n = 2$ the elements of $H^2(G,A)$ correspond bijectively to equivalence
classes of extensions of $G$ by $A$. Consider an extension

$$1 \to A \to \tilde{G} \to G \to 1. \tag{4.5.20}$$

For all $g \in G$ choose a lift $\tilde{g}$ in $\tilde{G}$ (i.e. choose a section $g \mapsto \tilde{g}$ of the projection
$\tilde{G} \to G$). Define $f : G \times G \to A$, $f(g_1,g_2) \in A$, by

$$\tilde{g}_1 \cdot \tilde{g}_2 = f(g_1,g_2)\, \widetilde{g_1 g_2}.$$

Then the function $f$ is a 2-cocycle of $G$ with values in $A$. If we change our
choice of representatives $\tilde{g}$ (i.e. the choice of section $G \to \tilde{G}$), then $f$ is altered
by a coboundary. Hence the class of $f$ depends only on the extension (4.5.20).
The group $H^2(G,\mathbb{C}^\times)$ is also called the Schur multiplier of $G$. Let $L/K$ be a
Galois extension with Galois group $G = G(L/K)$. Then $L^\times$ is a $G$-module
and $H^2(G,L^\times)$ can be interpreted as the Brauer group, see §4.5.5.
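A minimal worked instance of this construction, written additively: the extension $0 \to \mathbb{Z} \xrightarrow{\cdot n} \mathbb{Z} \to \mathbb{Z}/n \to 0$ with trivial action. The choice $n = 5$, the two sections and all names below are assumptions made for illustration. The sketch computes the 2-cocycle attached to a section (the familiar "carry" of addition modulo $n$) and checks that changing the section alters it by a coboundary.

```python
from itertools import product

n = 5  # hypothetical choice: the extension 0 -> Z --(times n)--> Z -> Z/n -> 0

def cocycle_from_section(section):
    """2-cochain f defined by s(g1) + s(g2) = n*f(g1, g2) + s(g1 + g2)."""
    return {(g1, g2): (section[g1] + section[g2] - section[(g1 + g2) % n]) // n
            for g1, g2 in product(range(n), repeat=2)}

def is_2cocycle(f):
    """Check d_2 f = 0 from (4.5.16), for the trivial action of G on A = Z."""
    return all(f[g2, g3] - f[(g1 + g2) % n, g3] + f[g1, (g2 + g3) % n] - f[g1, g2] == 0
               for g1, g2, g3 in product(range(n), repeat=3))

s1 = {g: g for g in range(n)}                      # lifts 0, 1, ..., n-1
shift = {g: (g * g) % 3 for g in range(n)}         # arbitrary integers c(g)
s2 = {g: s1[g] + n * shift[g] for g in range(n)}   # a second section
f1, f2 = cocycle_from_section(s1), cocycle_from_section(s2)

# f2 - f1 should equal the coboundary (d_1 c)(g1, g2) = c(g2) - c(g1 + g2) + c(g1).
differs_by_coboundary = all(
    f2[g1, g2] - f1[g1, g2] == shift[g2] - shift[(g1 + g2) % n] + shift[g1]
    for g1, g2 in product(range(n), repeat=2))
```

Here `f1[g1, g2]` is 1 exactly when `g1 + g2` overflows past `n`, which is why this cocycle is not a coboundary: the extension does not split.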

For the action of the Galois group $G = G(L/K)$ on $L^\times$ one has the following
fundamental theorem.


Theorem 4.25 (Hilbert's Theorem 90).
$$H^1(G(L/K), L^\times) = \{1\}.$$


The idea of the proof of this theorem is the same as in the description of
all cyclic extensions of $K$ in §4.5.2. Let $f : G \to L^\times$ be an arbitrary
skew-homomorphism, $f \in Z^1(G(L/K), L^\times)$. In multiplicative notation this means
that for all $g, h \in G$ we have $f(h)^g = f(gh)/f(g) \in L^\times$. We shall find an
element $b \in L^\times$ such that for all $g \in G$ one has $f(g) = b/b^g$. In order to do
this choose an element $\gamma \in L$ for which the element

$$b = \sum_{h \in G} f(h)\gamma^h \in L \tag{4.5.21}$$

is not equal to zero; such a $\gamma$ exists because the automorphisms $h \in G$ are
linearly independent over $L$. We apply to both sides of (4.5.21) an element $g \in G$.
Then

$$b^g = \sum_{h \in G} f(h)^g (\gamma^h)^g = \sum_{h \in G} \frac{f(gh)}{f(g)}\, \gamma^{gh} = f(g)^{-1} \sum_{h \in G} f(gh)\gamma^{gh} = f(g)^{-1} b$$

(by the formula of the (left) action of $G$ on $L$: $(\gamma^h)^g = \gamma^{gh}$ for $g, h \in G$).
This method of taking the average is also known as the construction of the
Lagrange resolvent in the theory of solvable extensions of fields.
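The averaging construction (4.5.21) can be tried out in the simplest case $L = \mathbb{Q}(i)$, $K = \mathbb{Q}$, $G = \{1, \sigma\}$ with $\sigma$ complex conjugation. A skew-homomorphism is then determined by $x = f(\sigma)$ with $x \cdot \sigma(x) = 1$. The class `Gauss` and the function `hilbert90` below are illustrative names, not taken from the text; this is a sketch under those assumptions.

```python
from fractions import Fraction

class Gauss:
    """Element re + im*i of Q(i), with exact rational coordinates."""
    def __init__(self, re, im=0):
        self.re, self.im = Fraction(re), Fraction(im)
    def __add__(self, o):
        return Gauss(self.re + o.re, self.im + o.im)
    def __mul__(self, o):
        return Gauss(self.re * o.re - self.im * o.im, self.re * o.im + self.im * o.re)
    def conj(self):                        # the non-trivial automorphism sigma
        return Gauss(self.re, -self.im)
    def inv(self):
        n = self.re ** 2 + self.im ** 2    # norm down to Q
        return Gauss(self.re / n, -self.im / n)
    def __eq__(self, o):
        return (self.re, self.im) == (o.re, o.im)

def hilbert90(x, gamma=Gauss(1)):
    """For x of norm 1, return b = f(1)*gamma + f(sigma)*sigma(gamma) with f(sigma) = x."""
    assert x * x.conj() == Gauss(1), "x must have norm 1"
    b = gamma + x * gamma.conj()           # formula (4.5.21) for G = {1, sigma}
    assert not b == Gauss(0), "b vanished; try another gamma"
    return b

x = Gauss(Fraction(3, 5), Fraction(4, 5))
b = hilbert90(x)                           # then x = b / sigma(b)
```

For $x = -1$ the default choice $\gamma = 1$ gives $b = 0$, which shows that $\gamma$ has to be chosen with some care; `hilbert90(Gauss(-1), Gauss(0, 1))` works.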


Properties of cohomology groups. 


1) For an arbitrary exact sequence of G-modules 


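$$0 \to A \to B \to C \to 0$$


the following long exact sequence of cohomology groups is defined:
$$\begin{aligned} 0 &\to H^0(G,A) \to H^0(G,B) \to H^0(G,C) \xrightarrow{\Delta_0} H^1(G,A) \\ &\to H^1(G,B) \to H^1(G,C) \xrightarrow{\Delta_1} H^2(G,A) \to \cdots \to H^n(G,A) \\ &\to H^n(G,B) \to H^n(G,C) \xrightarrow{\Delta_n} H^{n+1}(G,A) \to \cdots \end{aligned} \tag{4.5.22}$$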
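Example 4.26. Kummer theory. Let $K$ be a field containing the group $\mu_m$
of all roots of unity of degree $m$ in $\overline{K}$. Assume further that Char $K$ does
not divide $m$. For an arbitrary Galois extension $L/K$ with Galois group
$G = G(L/K)$ the map $x \mapsto x^m$ defines a homomorphism of $G$-modules
$\nu : L^\times \to L^\times$. For $L = \overline{K}$ and $G = G_K$ one has the following exact
sequence

$$1 \to \mu_m \to \overline{K}^\times \xrightarrow{\nu} \overline{K}^\times \to 1.$$

Passing to cohomology groups (4.5.22) we obtain the following long exact
sequence

$$\begin{aligned} H^0(G_K,\mu_m) &\to H^0(G_K,\overline{K}^\times) \xrightarrow{\nu} H^0(G_K,\overline{K}^\times) \\ &\to H^1(G_K,\mu_m) \to H^1(G_K,\overline{K}^\times) \xrightarrow{\nu} H^1(G_K,\overline{K}^\times) \to \cdots. \end{aligned} \tag{4.5.23}$$

Since the group $G_K$ acts trivially on $\mu_m$, it follows that $H^1(G_K,\mu_m)$
coincides with the group $\operatorname{Hom}(G_K,\mu_m)$. The group $H^0(G_K,\overline{K}^\times)$ is the
subgroup of all fixed points of the Galois action, i.e. $H^0(G_K,\overline{K}^\times) =
(\overline{K}^\times)^{G(\overline{K}/K)} = K^\times$. Also, $H^0(G_K,\mu_m) = \mu_m$, and $H^1(G_K,\overline{K}^\times) = \{1\}$
by Hilbert's Theorem 90. We thus have the following exact sequence

$$1 \to \mu_m \to K^\times \xrightarrow{\nu} K^\times \to \operatorname{Hom}(G_K,\mu_m) \to 1,$$

which is equivalent to the isomorphism of Kummer:
$$K^\times/K^{\times m} \cong \operatorname{Hom}(G_K,\mu_m).$$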


2) Let $H$ be an open normal subgroup in $G$ and $A$ a $G$-module. Then one
has the following "inflation-restriction" exact sequence:

$$0 \to H^1(G/H, A^H) \xrightarrow{\mathrm{Inf}} H^1(G,A) \xrightarrow{\mathrm{Res}} H^1(H,A), \tag{4.5.24}$$
in which Inf denotes the inflation homomorphism, which is defined by
"inflating" a cocycle $f$ on $G/H$ with values in $A^H \subset A$ to a cocycle on
$G$; and Res is the restriction homomorphism given by restricting cocycles
on $G$ to the subgroup $H$.

3) $\cup$-products. Let $A$, $B$, $C$ be three $G$-modules, for which some $G$-invariant
pairing $\circ : A \times B \to C$ is given (i.e. for all $g \in G$, $a \in A$, $b \in B$ we have
that $g(a \circ b) = ga \circ gb$). For example, if $A = B = C$ is a ring on which the
group $G$ acts trivially, then the multiplication in $A$ is such a pairing. Any
pairing $A \times B \to C$ induces for every $n \ge 0$ and $m \ge 0$ a bilinear map

$$H^n(G,A) \times H^m(G,B) \to H^{n+m}(G,C), \tag{4.5.25}$$

which is called the $\cup$-product. This is defined on cocycles by the following
rule. If $f \in C^n(G,A)$, $f' \in C^m(G,B)$ then the cochain
$$(f \circ f')(g_1,\dots,g_{n+m}) = f(g_1,\dots,g_n) \circ (g_1 \cdots g_n) f'(g_{n+1},\dots,g_{n+m}) \tag{4.5.26}$$

is a cocycle whenever $f$ and $f'$ are cocycles, as can be seen from the following equation:
$$d_{n+m}(f \circ f') = d_n f \circ f' + (-1)^n f \circ d_m f'.$$
The $\cup$-product (4.5.25) is well defined by the formula
$$\bar{f} \cup \bar{f}' = \overline{f \circ f'} \in H^{n+m}(G,C),$$
where the bar denotes the cohomology class of a cocycle. One has the equation
$$\alpha \cup \Delta_m \beta = (-1)^n \Delta_{n+m}(\alpha \cup \beta), \tag{4.5.27}$$

where $\Delta_n$ is the "connecting homomorphism" of the long exact sequence
(4.5.22). If $A = B = C$ is a commutative ring on which $G$ acts trivially,
then for all $\alpha \in H^n(G,A)$, $\beta \in H^m(G,A)$ one has

$$\alpha \cup \beta = (-1)^{nm} \beta \cup \alpha. \tag{4.5.28}$$
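Returning to Example 4.26, the Kummer isomorphism can be sanity-checked for a finite field $K = \mathbb{F}_p$ with $m \mid p - 1$, so that $\mu_m \subset K$. In that case $G_K$ is procyclic, generated by the Frobenius automorphism, so $\operatorname{Hom}(G_K,\mu_m) \cong \mu_m$ has order $m$, and the quotient $K^\times/K^{\times m}$ should have the same order. The function names below are hypothetical; this is a sketch only.

```python
def kummer_quotient_order(p, m):
    """Order of F_p^* / (F_p^*)^m for a prime p with m dividing p - 1."""
    assert (p - 1) % m == 0, "need the m-th roots of unity inside F_p"
    mth_powers = {pow(x, m, p) for x in range(1, p)}   # the subgroup of m-th powers
    return (p - 1) // len(mth_powers)

def roots_of_unity(p, m):
    """The group mu_m inside F_p^*."""
    return [x for x in range(1, p) if pow(x, m, p) == 1]
```

For instance, with `p = 13` and `m = 3` both the quotient and $\mu_3$ have three elements.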


4.5.4 A Cohomological Definition of the Local Symbol 


Let $K$ be a finite extension of the field $\mathbb{Q}_p$ of $p$-adic numbers. The local Artin
symbol is a homomorphism

$$\theta : K^\times \to G_K^{ab} = \varprojlim_L G(L/K) \tag{4.5.29}$$

from the multiplicative group of $K$ to the Galois group of the maximal Abelian
extension (the union of all finite Abelian extensions $L/K$) of $K$. This
homomorphism was described in §4.4 using powerful global methods, namely the Artin
reciprocity law. However, the local symbol can be defined purely locally. With
this approach the global reciprocity law can then be deduced from the
properties of the local symbols by proving the product formula (4.3.31).

We shall define for a given $\alpha \in K^\times$ the image $\theta(\alpha) = \theta_{L/K}(\alpha) \in G(L/K)$
(in a finite Abelian extension $L/K$) using the characters $\chi \in \operatorname{Hom}(G(L/K),\mathbb{Q}/\mathbb{Z})$.
Note that the element $\theta(\alpha)$ of the finite Abelian group $G(L/K)$ is completely
determined by the values $\chi(\theta(\alpha))$ for all characters $\chi$ of $G(L/K)$. For the
trivial $G(L/K)$-module $\mathbb{Q}/\mathbb{Z}$ we have:

$$\operatorname{Hom}(G(L/K),\mathbb{Q}/\mathbb{Z}) = H^1(G(L/K),\mathbb{Q}/\mathbb{Z}),$$

and there is an exact sequence

$$0 \to \mathbb{Z} \to \mathbb{Q} \to \mathbb{Q}/\mathbb{Z} \to 0,$$

which gives rise to the isomorphism
$$\Delta_1 : H^1(G(L/K),\mathbb{Q}/\mathbb{Z}) \xrightarrow{\ \sim\ } H^2(G(L/K),\mathbb{Z}). \tag{4.5.30}$$

The latter is found by considering the long exact sequence (4.5.22), and using
the fact that all the higher cohomology groups of the divisible group $\mathbb{Q}$ are
trivial: $H^i(G(L/K),\mathbb{Q}) = \{0\}$ for $i \ge 1$.

As we have seen in §4.5.3, $H^1(G(L/K), L^\times) = \{1\}$. Moreover, the following
fundamental facts on the cohomology groups of the multiplicative group are
known:


a) $H^3(G(L/K), L^\times) = \{1\}$ ($L$ is a local field).
b) There exists an embedding

$$\mathrm{inv}_K : H^2(G(L/K), L^\times) \to \mathbb{Q}/\mathbb{Z}. \tag{4.5.31}$$

The image of an element $\beta \in H^2(G(L/K), L^\times)$ under this embedding
is called the invariant of $\beta$. For a finite extension $L/K$ the group
$H^2(G(L/K), L^\times)$ is cyclic of order $[L : K]$.


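Now consider the pairing

$$L^\times \times \mathbb{Z} \to L^\times \qquad ((\alpha, m) \mapsto \alpha^m).$$

This induces a $\cup$-product in the cohomology groups

$$H^0(G(L/K), L^\times) \times H^2(G(L/K),\mathbb{Z}) \to H^2(G(L/K), L^\times).$$
Recall that $H^0(G(L/K), L^\times) = K^\times$. For $\Delta_1\chi \in H^2(G(L/K),\mathbb{Z})$, we have
$$\alpha \cup \Delta_1\chi \in H^2(G(L/K), L^\times)$$

for $\alpha \in K^\times$. Define for each character $\chi$,

$$\chi(\theta_{L/K}(\alpha)) = \mathrm{inv}_K(\alpha \cup \Delta_1\chi). \tag{4.5.32}$$

This determines $\theta_{L/K}(\alpha)$ as a well defined element of $G(L/K)$. Passing to the
projective limit in (4.5.29), we obtain an element

$$\theta(\alpha) = \varprojlim_L \theta_{L/K}(\alpha) \in G_K^{ab}.$$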


To do this we need the following compatibility property. Consider a tower of
(Abelian) Galois extensions $K \subset L' \subset L$ and let $G = G(L/K)$, $H = G(L/L')$.
Let $\chi'$ be a character of $G/H$ and $\chi$ the character of $G$ obtained from it by
inflation. Then if $\alpha \in K^\times$ induces an element $s_\alpha = \theta_{L/K}(\alpha) \in G$
and the element $s'_\alpha \in G/H$ under the projection $G \to G/H$, then we have
that $\chi(s_\alpha) = \chi'(s'_\alpha)$. This follows from the definition $\chi(s_\alpha) = \mathrm{inv}_K(\alpha \cup \Delta_1\chi)$
together with the fact that the inflation map takes $\chi'$ (respectively, $\Delta_1\chi'$) to
the character $\chi$ (respectively, to $\Delta_1\chi$), using the commutative diagram

$$\begin{array}{ccc} H^2(G/H, L'^\times) & \xrightarrow{\ \mathrm{Inf}\ } & H^2(G, L^\times) \\ & \searrow \quad \swarrow & \\ & \mathbb{Q}/\mathbb{Z} & \end{array} \tag{4.5.33}$$

in which both arrows to $\mathbb{Q}/\mathbb{Z}$ are the invariant maps.

The map $\mathrm{inv}_K$ will be defined in the next subsection via the Brauer group. The
above compatibility property will also be discussed there. This compatibility
property is very important, since it makes it possible to define the symbol
(4.5.29).

If the field $K$ contains a primitive root of unity $\zeta_m$ of degree $m$, then the
power residue symbol $(\alpha, \beta)$ of degree $m$ can be defined for $\alpha, \beta \in K^\times$ by the
condition

$$\theta_{L/K}(\beta) \cdot \sqrt[m]{\alpha} = (\alpha, \beta) \cdot \sqrt[m]{\alpha}, \tag{4.5.34}$$

where $L = K(\sqrt[m]{\alpha})$ is a cyclic extension and $\theta_{L/K}(\beta)$ is the local symbol
(4.5.32). The values of $(\alpha, \beta)$ are roots of unity of degree $m$, and they satisfy
the following conditions:

$(\alpha\alpha', \beta) = (\alpha, \beta)(\alpha', \beta)$;

$(\alpha, \beta\beta') = (\alpha, \beta)(\alpha, \beta')$;

$(\alpha, \beta)(\beta, \alpha) = 1$;

if $(\alpha, \beta) = 1$ for all $\beta \in K^\times$ then $\alpha \in K^{\times m}$;

$(\alpha, \beta) = 1$ iff $\beta$ is a norm of an element in the extension $K(\sqrt[m]{\alpha})/K$.


The power residue symbol can be interpreted as a $\cup$-product in
certain one-dimensional cohomology groups, cf. [Koch70]. An explicit
calculation of this symbol is given in [Koly79], [Sha50].
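For $m = 2$ and $K = \mathbb{Q}_p$ with $p$ odd, the symbol above is the quadratic Hilbert symbol, for which a classical closed formula is known: writing $a = p^{k_a}u$, $b = p^{k_b}v$ with units $u, v$, one has $(a,b)_p = (-1)^{k_a k_b (p-1)/2} \left(\frac{u}{p}\right)^{k_b} \left(\frac{v}{p}\right)^{k_a}$. The sketch below uses that formula, which is not stated in the text, on nonzero integers; the function names are illustrative.

```python
def split(a, p):
    """Write a nonzero integer a as p**k * u with u prime to p; return (k, u)."""
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k, a

def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p not dividing u."""
    return -1 if pow(u % p, (p - 1) // 2, p) == p - 1 else 1

def hilbert(a, b, p):
    """Quadratic Hilbert symbol (a, b)_p for an odd prime p and nonzero integers a, b."""
    ka, u = split(a, p)
    kb, v = split(b, p)
    sign = -1 if (ka * kb * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** kb * legendre(v, p) ** ka
```

With this one can test the listed properties for $m = 2$: bilinearity in each argument, the relation $(a,b)(b,a) = 1$ (which here means symmetry), and for example `hilbert(2, 5, 5) == -1`, reflecting that 5 is not a norm from $\mathbb{Q}_5(\sqrt{2})$.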


4.5.5 The Brauer Group, the Reciprocity Law and the 
Minkowski—Hasse Principle 


Recall first some basic facts about the Brauer group of an arbitrary field $K$
(see [Man70b], [Man72b], [Se63], [Se86], [Chebo49]).

A finite dimensional algebra $A$ over $K$ is called a simple central algebra
over $K$ if there exists $n \ge 1$ such that $A \otimes_K \overline{K} \cong M_n(\overline{K})$, where $M_n$ denotes the
$n \times n$ matrix algebra and $\overline{K}$ is an algebraic closure of $K$. The tensor product
induces a commutative semigroup structure on the set of simple central
$K$-algebras (modulo isomorphism). The following equivalence relation turns this
set into a group: we say that an algebra $A$ is equivalent to an algebra $B$ if
there exist $m, n \ge 1$ such that $A \otimes M_m(K)$ is isomorphic to $B \otimes M_n(K)$.
All matrix algebras are equivalent to each other, and they form the identity
class of algebras. The class of the algebra $A^\circ$, opposite to $A$ (i.e. consisting
of the same elements and having the same addition but the opposite order
of multiplication), is the inverse of the class of $A$ in the group structure induced by the
tensor product. To see this, consider the canonical map $A \otimes A^\circ \to \operatorname{End}_K(A)$
(endomorphisms of the linear space $A$), which assigns to an element $x \otimes y \in
A \otimes A^\circ$ the multiplication by $x$ on the left, followed by the multiplication
by $y$ on the right. The kernel of this map is trivial, since $A \otimes A^\circ$ is simple,
and the dimension of $A \otimes A^\circ$ coincides with the dimension of $\operatorname{End}_K(A)$, i.e.
with $(\dim A)^2$. Hence the map is an isomorphism, so $A \otimes A^\circ$ is isomorphic to
$\operatorname{End}_K(A) \cong M_{\dim A}(K)$.

The group of classes of central simple algebras over $K$ is called the Brauer
group of $K$ and is denoted by $\operatorname{Br} K$. We shall now describe the Brauer group
in cohomological terms.

Let $L/K$ be an extension of $K$. It is called a splitting field of a $K$-algebra
$A$ iff $A \otimes_K L \cong M_n(L)$. Equivalent algebras have the same splitting fields.
Let $\operatorname{Br}(K,L)$ be the subset of the Brauer group consisting of those classes
of $K$-algebras which split over $L$. This is a subgroup. Now assume that $L/K$
is a Galois extension with Galois group $G = G(L/K)$. One has the following
fundamental isomorphism:


$$\operatorname{Br}(K, L) \cong H^2(G, L^\times). \tag{4.5.35}$$


This isomorphism can be constructed in various ways; we point out one of
these, the so-called construction of "skew products" (crossed products). This method consists of
explicitly constructing a central simple algebra over $K$ from a given "factor
set", i.e. from a cocycle $\{a_{g,h}\} \in Z^2(G, L^\times)$. The algebra is constructed as
follows:
$$A = \bigoplus_{g \in G} L e_g,$$

with multiplication given by
$$e_g e_h = a_{g,h}\, e_{gh} \quad \text{for all } g, h \in G,$$

$$e_g a = g(a)\, e_g \quad \text{for all } a \in L,\ g \in G.$$

Its dimension over $K$ is obviously equal to $[L : K]^2$. We omit the verification of the
various necessary properties of the construction; note only that the
associativity of $A$ is equivalent to the fact that the cochain of structural constants is
actually a cocycle.

The condition that $A$ splits over $L$ has important arithmetical implications.
Put $N = n^2$ and choose a basis $\{a_1, \dots, a_N\}$ of $A$ over $K$. If we use the
isomorphism

$$F : A \otimes_K \overline{K} \to M_n(\overline{K}), \tag{4.5.36}$$

then all of the elements $a = \sum_{i=1}^{N} x_i a_i \in A$ ($x_i \in K$) become matrices $F(a) \in
M_n(\overline{K})$. Then it is not difficult to check that the maps
$$\tau(a) = \operatorname{Tr}(F(a)), \qquad \nu(a) = \det(F(a))$$

are polynomial functions of $x_1, \dots, x_N$ with coefficients in the ground field $K$.
These maps are called respectively the reduced trace and the reduced norm of
the element $a \in A$, cf. [Wei74a]:

$\tau(a) = l_A(x_1, x_2, \dots, x_N)$, a linear form,

$\nu(a) = \Phi_A(x_1, x_2, \dots, x_N)$, a homogeneous polynomial of degree $n$.

Since $F(ab) = F(a)F(b)$ by the isomorphism (4.5.36), $\nu(ab) = \nu(a)\nu(b)$.
Moreover, in case the algebra $A$ is a division algebra, each non-zero
element of $A$ is invertible. Thus the form $\Phi_A$ has no non-trivial zero over $K$.
On the other hand if $A \otimes_K L \cong M_n(L)$, then $\Phi_A$ does have a non-trivial zero
over $L$; under this isomorphism the solutions to the equation

$$\Phi_A(x_1, \dots, x_N) = 0 \quad (x_i \in L) \tag{4.5.37}$$

correspond exactly to degenerate matrices.
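As an illustration with $n = 2$, consider a quaternion algebra over $\mathbb{Q}$ with basis $1, i, j, k = ij$ and relations $i^2 = a$, $j^2 = b$, $ji = -ij$ (the parameters $a, b$ and the function names are assumptions for this sketch). Its reduced norm is the quadratic form $x_0^2 - a x_1^2 - b x_2^2 + ab\, x_3^2$, and the code checks multiplicativity on integer points.

```python
def qmul(x, y, a, b):
    """Product in the quaternion algebra with i^2 = a, j^2 = b, k = ij = -ji."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 + a*x1*y1 + b*x2*y2 - a*b*x3*y3,   # coefficient of 1
            x0*y1 + x1*y0 - b*x2*y3 + b*x3*y2,       # coefficient of i
            x0*y2 + x2*y0 + a*x1*y3 - a*x3*y1,       # coefficient of j
            x0*y3 + x3*y0 + x1*y2 - x2*y1)           # coefficient of k

def reduced_norm(x, a, b):
    """The form Phi_A: homogeneous of degree n = 2 in the coordinates."""
    x0, x1, x2, x3 = x
    return x0*x0 - a*x1*x1 - b*x2*x2 + a*b*x3*x3

def reduced_trace(x):
    """The linear form l_A."""
    return 2 * x[0]
```

For $a = b = -1$ (Hamilton's quaternions) the norm is a sum of four squares and has no non-trivial rational zero, as expected for a division algebra. For $a = b = 1$ the algebra is a matrix algebra and `reduced_norm((1, 1, 0, 0), 1, 1)` vanishes.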
We now describe the local invariant (cf. [Chev40], [Se63])

$$\mathrm{inv}_K : \operatorname{Br} K \to \mathbb{Q}/\mathbb{Z} \tag{4.5.38}$$

in the case when $K$ is a finite extension of $\mathbb{Q}_p$. Let $A$ be a central division
algebra (a skew-field) over the field $K$, $[A : K] = n^2$. The valuation $v = v_K$
of $K$ has a unique extension to a valuation $v_A$ of $A$, coinciding with $v_K$ on
the center of $A$. For example, one can first extend $v$ over local fields $K(\alpha)$ for
$\alpha \in A$ and then use the compatibility of these continued absolute values (in
view of the uniqueness property of continuations of absolute values to finite
extensions of a local field). Considering the reduction of the algebra $A$ modulo
the valuation $v$, one checks that $A$ contains a maximal commutative subfield
$L$ unramified over the center $K$, and an element $\delta \in \operatorname{Br} K$ corresponding to
$A$ splits over $L$, i.e. $\delta \in H^2(G(L/K), L^\times)$. A maximal unramified extension $L$
may not be unique in $A$, but all these extensions are conjugate in view of the
theorem of Skolem-Noether. This theorem states that each automorphism of
$L$ in $A$ over $K$ is induced by an inner automorphism of $A$. Consequently, there
exists an element $\gamma \in A$ such that $\gamma L \gamma^{-1} = L$ and the inner automorphism
$x \mapsto \gamma x \gamma^{-1}$, restricted to the subfield $L$, coincides with the Frobenius
automorphism $\mathrm{Fr}_{L/K}$. Moreover, the element $\gamma$ is uniquely defined up to a factor
from $L^\times$. Let $v_A : A^\times \to \frac{1}{n}\mathbb{Z}$ be an extension of $v_K$ onto $A$. Then one can
define $\mathrm{inv}_K \delta$ as the image of $v_A(\gamma)$ in the group $(\frac{1}{n}\mathbb{Z})/\mathbb{Z} \subset \mathbb{Q}/\mathbb{Z}$. This definition
may be restated, taking into account the fact that the map $x \mapsto \gamma^n x \gamma^{-n}$ is
equal to $\mathrm{Fr}^n_{L/K}$ on $L$ and is thus the identity (since $n = [L : K]$). It therefore follows
that the element $\gamma^n$ commutes with all elements of $L$ and $\gamma^n = c \in L^\times$. This
gives us

$$v_A(\gamma) = \frac{1}{n} v_A(\gamma^n) = \frac{1}{n} v_A(c) = \frac{1}{n} v_L(c). \tag{4.5.39}$$

Thus we have that
$$\mathrm{inv}_K \delta = i/n \qquad (c = \pi_L^i u),$$

where $u \in O_L^\times$, $\pi_L$ is a uniformizing element in $L$, i.e. $v_L(\pi_L) = 1$, $v_L(u) = 0$.

Passing to the global case, we consider a Galois extension of number fields
$L/K$ with Galois group $G = G(L/K)$. Let $G^v \subset G$ denote the decomposition
group of an extension $w$ of a place $v$ to $L$. If the extension $L/K$ is Abelian
then we know that the group $G^v$ is uniquely determined by $v$ (cf. §4.4). The
inclusion $L \to L_w$ induces a homomorphism

$$\varphi_v : H^2(G, L^\times) \to H^2(G^v, L_w^\times). \tag{4.5.40}$$

One verifies that for an element $\alpha \in H^2(G, L^\times)$ the images $\varphi_v\alpha$ vanish
for almost all $v$ (all but a finite number): if a cocycle $\{a_{g,h}\} \in Z^2(G, L^\times)$
representing $\alpha$ satisfies the condition $a_{g,h} \in O_w^\times$ and the extension $L_w/K_v$ is
unramified, then $\varphi_v\alpha = 0$, because $H^i(G^v, O_w^\times) = 0$ for $i \ge 1$. This fact is deduced from the
exact sequence of cohomology groups obtained from the short exact sequence

$$1 \to O_w^\times \to L_w^\times \xrightarrow{\ w\ } \mathbb{Z} \to 0.$$

This is actually a version of Hensel's lemma, cf. §4.3.2.
Thus there exists a well defined map

$$H^2(G, L^\times) \to \bigoplus_v H^2(G^v, L_w^\times), \tag{4.5.41}$$

where $w$ is a fixed continuation of a place $v$ and the summation runs through
all places $v$ of $K$. In this situation the local invariants

$$\mathrm{inv}_{K_v} : H^2(G^v, L_w^\times) \to \mathbb{Q}/\mathbb{Z}$$

induce a map

$$\bigoplus_v H^2(G^v, L_w^\times) \to \mathbb{Q}/\mathbb{Z}, \tag{4.5.42}$$

which is defined to be the sum of all the local invariants.
The Minkowski-Hasse Local-Global Principle states that the sequence

$$0 \to H^2(G, L^\times) \to \bigoplus_v H^2(G^v, L_w^\times) \to \mathbb{Q}/\mathbb{Z}, \tag{4.5.43}$$

obtained from (4.5.41) and (4.5.42), is exact.

This exact sequence (4.5.43) plays a key role in many arithmetical
questions. For example, the statement that (4.5.41) is an embedding is equivalent
to saying that for the reduced norm

$$\nu(a) = \Phi_A(x_1, x_2, \dots, x_N) \tag{4.5.44}$$
the Minkowski-Hasse principle holds, i.e. the form $\nu(a) = \Phi_A(x_1, x_2, \dots, x_{n^2})$
has a non-trivial zero over $L$ iff it has a non-trivial zero over each completion
of $L$.

The exactness in the middle term $\bigoplus_v H^2(G^v, L_w^\times)$ describes completely
the classes of central simple algebras $A$ which split over $L$. They correspond
bijectively to tuples of numbers $i(v)$, $0 \le i(v) < n$, the sum of which is
divisible by $n$; for some algebra $A$ with the class $\delta \in H^2(G, L^\times)$ one has
$\mathrm{inv}_{K_v}(\delta) = i(v)/n \in \frac{1}{n}\mathbb{Z}/\mathbb{Z}$.

Finally, the statement that for $\delta \in H^2(G, L^\times)$ one always has

$$\sum_v \mathrm{inv}_{K_v}(\varphi_v(\delta)) = 0 \in \mathbb{Q}/\mathbb{Z},$$

is essentially equivalent to the product formula for local symbols (4.5.38), and
to the global reciprocity law.

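Indeed, if $\alpha = (\alpha_v)_v \in J_K$ is an idele, then the global Artin symbol $\theta(\alpha) \in
G_K^{ab}$ is defined as the limit $\theta(\alpha) = \lim_S \prod_{v \in S} \theta_v(\alpha_v)$, where the product is
finite, and the local symbols are defined by the condition

$$\chi(\theta_v(\alpha_v)) = \mathrm{inv}_{K_v}(\alpha_v \cup \Delta_1\chi) \tag{4.5.45}$$

(see (4.5.32)) for all characters $\chi \in H^1(G_K^{ab},\mathbb{Q}/\mathbb{Z})$.
If $\alpha \in K^\times$, i.e. $\alpha_v = \alpha \in K_v^\times$ for all $v$, then for all characters $\chi \in
H^1(G_K^{ab},\mathbb{Q}/\mathbb{Z})$ one has

$$\chi\Big(\prod_v \theta_v(\alpha)\Big) = \sum_v \mathrm{inv}_{K_v}(\alpha \cup \Delta_1\chi) = 0,$$

since the element
$$\alpha \cup \Delta_1\chi \in H^2(G_K^{ab},\mathbb{Q}/\mathbb{Z})$$

belongs to the global Brauer group.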
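In the simplest case $K = \mathbb{Q}$ with quadratic characters, the vanishing of the sum of local invariants is usually identified with Hilbert's product formula $\prod_{p \le \infty} (a,b)_p = 1$ for quadratic Hilbert symbols. The sketch below checks this numerically for small integers. It assumes the classical explicit formulas for $(a,b)_p$ (including $p = 2$ and the real place), which are not given in the text, and all names are illustrative.

```python
def split(a, p):
    """Write a nonzero integer a as p**k * u with u prime to p; return (k, u)."""
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k, a

def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p not dividing u."""
    return -1 if pow(u % p, (p - 1) // 2, p) == p - 1 else 1

def hilbert(a, b, p):
    """Quadratic Hilbert symbol (a, b)_p; p is a prime or the string "inf"."""
    if p == "inf":
        return -1 if (a < 0 and b < 0) else 1
    (ka, u), (kb, v) = split(a, p), split(b, p)
    if p == 2:
        eps = ((u - 1) // 2) * ((v - 1) // 2)
        omega_u, omega_v = (u * u - 1) // 8, (v * v - 1) // 8
        return (-1) ** ((eps + ka * omega_v + kb * omega_u) % 2)
    sign = (-1) ** ((ka * kb * ((p - 1) // 2)) % 2)
    return sign * legendre(u, p) ** kb * legendre(v, p) ** ka

def prime_divisors(n):
    """Set of primes dividing the nonzero integer n."""
    n, found, q = abs(n), set(), 2
    while q * q <= n:
        while n % q == 0:
            found.add(q)
            n //= q
        q += 1
    if n > 1:
        found.add(n)
    return found

def product_of_symbols(a, b):
    """Product of (a, b)_p over the finitely many places where it can differ from 1."""
    places = {2, "inf"} | prime_divisors(a) | prime_divisors(b)
    result = 1
    for p in places:
        result *= hilbert(a, b, p)
    return result
```

Only the places 2, $\infty$ and the primes dividing $ab$ need to be visited, since the symbol is trivial at every other odd prime.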
In the case when the extension $L/K$ is cyclic, one can construct using
purely cohomological methods a canonical isomorphism

$$H^2(G(L/K), L^\times) \cong K^\times/N_{L/K}L^\times \tag{4.5.46}$$
and the exact sequence (4.5.43) implies the following:


Theorem 4.27 (Hasse's Theorem on Norms). If $a \in K^\times$ and $L/K$ is a
cyclic extension, then $a \in N_{L/K}L^\times$ if and only if $a \in N_{L_w/K_v}L_w^\times$ for all places
$v$ of $K$.


In particular, let $G$ be the group of order 2, so that $L = K(\sqrt{b})$. Then
$N_{L/K}(x + y\sqrt{b}) = x^2 - by^2$. Hence $a$ can be represented by the form $x^2 - by^2$ over
$K$ iff it can be represented by it everywhere locally, i.e. over every completion
of $K$. This implies that a quadratic form $Q(x, y, z)$ in three variables over $K$
has a non-trivial zero over $K$ iff it has a non-trivial zero over every completion
of $K$. Passing to an arbitrary number $n$ of variables we obtain the Minkowski-Hasse theorem, which
states that a quadratic form has a non-trivial zero over $K$ iff it has a
non-trivial zero everywhere locally, cf. [Chev40], [Cas78].


It was pointed out to us by B. Moroz (MPIM-Bonn) that Hasse's Theorem
on Norms may hold for some non-cyclic extensions, providing interesting
examples of the validity of the Minkowski-Hasse principle, and Theorem 6.11
of [PlRa83] at p. 309 gives an interesting cohomological condition for Hasse's
Theorem on Norms to hold.


5 


Arithmetic of algebraic varieties 


5.1 Arithmetic Varieties and Basic Notions of Algebraic 
Geometry 


5.1.1 Equations and Rings 


(cf. [Sha88], [Sha87], [Bou62]). The machinery of algebraic geometry uses
commutative rings instead of equations. Replacing a system of equations by a ring
is similar to replacing an algebraic number given as a root of a polynomial by
the corresponding field (or ring) extension. Consider a system of equations

$$X : F_i(T_j) = 0 \quad (i \in I,\ j \in J).$$

Here $I$ and $J$ are index sets; the $T_j$ are independent variables; the $F_i$ are
polynomials from the ring $K[T_j]$ and $K$ is a commutative ring. We shall say that $X$ is
defined over $K$. Now the question arises, which objects should be called
solutions of the system $X$? There is an obvious definition: it is a family $(t_j)$, $j \in J$,
of elements of $K$ such that $F_i(t_j) = 0$ for all $i \in I$. However, this definition is
too restrictive. We could also be interested in solutions not belonging to $K$,
for example the complex roots of a polynomial with rational coefficients. In
general, consider a $K$-algebra $L$.


5.1.2 The set of solutions of a system 


Definition 5.1. An $L$-valued solution of $X$ is a family $(t_j)$, $j \in J$, of elements
of $L$ such that $F_i(t_j) = 0$ for all $i \in I$. The set of all such solutions is denoted
$X(L)$.


Since every ring is a $\mathbb{Z}$-algebra, if $X$ is defined over $\mathbb{Z}$ then we can
consider its solutions with values in any ring. Let $f : L_1 \to L_2$ be a $K$-algebra
homomorphism, i.e. a homomorphism of rings and of $K$-modules. Then for
any $L_1$-valued solution $(t_j)$ of $X$, $(f(t_j))$ is an $L_2$-valued solution. Hence $f$
induces a map $X(L_1) \to X(L_2)$.
5.1.3 Example: The Language of Congruences


Let $n$ be an integer of the form $4m + 3$. Here is the classical proof that $n$ is
not a sum of two integral squares: if it were then there would be a solution
to the congruence $T_1^2 + T_2^2 \equiv 3 \bmod 4$, whereas a short case-by-case check
shows that this is unsolvable. From our new viewpoint this argument can be
rephrased as follows. Let $X$ denote the equation $T_1^2 + T_2^2 - n = 0$ ($K = \mathbb{Z}$). We
want to prove that $X(\mathbb{Z}) = \emptyset$. Consider $\mathbb{Z}/4\mathbb{Z}$ as a $\mathbb{Z}$-algebra via the reduction
homomorphism $\mathbb{Z} \to \mathbb{Z}/4\mathbb{Z}$. There is then an induced map $X(\mathbb{Z}) \to X(\mathbb{Z}/4\mathbb{Z})$.
If $X(\mathbb{Z})$ were non-empty, $X(\mathbb{Z}/4\mathbb{Z})$ would also be non-empty, which is false.
In general, for any system $X$ over $\mathbb{Z}$, if $X(L)$ is empty for some algebra $L$, then
$X(\mathbb{Z})$ is empty. In practice one usually tests for solutions in the finite rings
$\mathbb{Z}/m\mathbb{Z}$ and the real numbers $\mathbb{R}$. A more satisfactory theoretical formulation
uses $p$-adic fields and the ring of adèles (see Chapter 4, §4.3).
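The case-by-case check is small enough to run by machine. The sketch below enumerates $X(\mathbb{Z}/m\mathbb{Z})$ for the equation $T_1^2 + T_2^2 - n = 0$ and also models the induced map $X(\mathbb{Z}) \to X(\mathbb{Z}/m\mathbb{Z})$; the function names are chosen for this illustration only.

```python
from itertools import product

def solutions_mod(n, m):
    """All (t1, t2) in (Z/mZ)^2 with t1^2 + t2^2 - n = 0, i.e. the set X(Z/mZ)."""
    return [(t1, t2) for t1, t2 in product(range(m), repeat=2)
            if (t1 * t1 + t2 * t2 - n) % m == 0]

def reduce_solution(sol, m):
    """Image of an integral solution under the map X(Z) -> X(Z/mZ)."""
    return tuple(t % m for t in sol)
```

For $n = 7$ or $n = 11$ the list `solutions_mod(n, 4)` is empty, so $X(\mathbb{Z})$ is empty as well. For $n = 5$ the integral solution $(1, 2)$ reduces to an element of `solutions_mod(5, 4)`.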


5.1.4 Equivalence of Systems of Equations 


Definition 5.2. Two systems of equations $X$ and $Y$ with one and the same
family of indeterminates over a ring $K$ are called equivalent if $X(L) = Y(L)$
for each $K$-algebra $L$. Among all systems equivalent to a given one $X$, there
is a largest one. Its left hand sides form the ideal $P$ generated in $K[T_j]$ by
the $F_i(T_j)$. In order to see that this is equivalent to $X$, it suffices to take
$L = K[T_j]/P$.


5.1.5 Solutions as K-algebra Homomorphisms 


We summarize the results of our discussion. Starting with the system $X$ as
above, we construct the algebra $A = K[T_j]/P$. Then for any $K$-algebra $L$ we
have a natural identification

$$X(L) = \operatorname{Hom}_K(A, L).$$

The system $X$ is called solvable if $X(L)$ is non-empty for some non-trivial
(that is, with $0 \ne 1$) $K$-algebra $L$. One sees that $X$ is solvable iff 1 is not
contained in $P$.

We have established the equivalence of two languages: systems of equations 
up to equivalence and algebras with a marked family of generators. Forgetting 
about the generators, we identify further those systems of equations that 
are related by invertible changes of variables. Each element of A can play 
the role of an indeterminate in a suitable system. The value taken by this 
indeterminate at a given solution is equal to its image with respect to the 
homomorphism A — L corresponding to this solution. 

In classical algebraic geometry, an (affine) algebraic variety over an
algebraically closed field $K = \overline{K}$ is defined to be the set $Z \subset K^n$ of common
zeroes of a system of polynomials
$$F_i(T_1, \dots, T_n) \in K[T_1, \dots, T_n].$$
The ring of regular algebraic functions on $Z$ is by definition

$$A = K[Z] = K[T_1, \dots, T_n]/P_Z,$$

where $P_Z$ is the ideal consisting of all polynomials vanishing on $Z$. Obviously,
$A$ is a finitely generated $K$-algebra without nilpotents. Conversely any such
algebra is of the type $K[Z]$.

The abstract notion of a scheme allows us to consider an arbitrary com- 
mutative ring A as a set of functions on a space Spec(A). 


5.1.6 The Spectrum of A Ring 


Definition 5.3. The set of all prime ideals of a (commutative) ring $A$
(distinct from $A$) is called the spectrum of $A$ and is denoted $\operatorname{Spec}(A)$. An element
$x \in \operatorname{Spec}(A)$ is called a point of the spectrum; the corresponding ideal is
denoted $\mathfrak{p}_x \subset A$.

Recall that an ideal $\mathfrak{p} \subset A$ is prime iff the quotient ring $A/\mathfrak{p}$ has no zero
divisors. We shall denote the field of fractions of $A/\mathfrak{p}_x$ by $R(x)$.


5.1.7 Regular Functions 


Each element $f$ of $A$ defines a function on $\operatorname{Spec}(A)$ whose value at a point
$x$ is the residue class $f(x) = f \bmod \mathfrak{p}_x$, considered as an element of $R(x)$.
Two distinct elements of A may take the same values at all points of the 
spectrum. This happens iff their difference belongs to the intersection of all 
prime ideals of A, i.e. to the ideal of all nilpotent elements of A (cf. [Bou62], 
[SZ75]). For this reason, the rings of functions of classical algebraic geometry 
usually contained no nilpotents. However, this restriction is unnatural even in 
many classical situations, since nilpotents arise geometrically when an alge- 
braic variety depending on a parameter degenerates in a certain way (e.g. a 
polynomial acquires multiple roots). For this reason nilpotents are allowed in 
modern algebraic geometry, and all elements of A are thought of as pairwise 
distinct regular functions on the spectrum. 

We now define a canonical topology on Spec(A). A minimal consistency 
requirement of this topology with a given set of functions is that the vanishing 
sets of all functions are closed. 


5.1.8 A Topology on Spec(A) 


For any subset $E \subset A$, denote by $V(E) \subset \operatorname{Spec}(A)$ the set of all points
$x \in \operatorname{Spec}(A)$ for which $f(x) = 0$ for all $f \in E$. The family $\{V(E)\}$ consists of
all closed sets of a topology on $\operatorname{Spec}(A)$ called the Zariski, or spectral topology.
Each ring homomorphism $\varphi : A \to B$ induces a continuous map
$${}^a\varphi : \operatorname{Spec}(B) \to \operatorname{Spec}(A).$$
By definition for $y \in \operatorname{Spec}(B)$, we have

$$\mathfrak{p}_{{}^a\varphi(y)} = \varphi^{-1}(\mathfrak{p}_y).$$

Each set $V(E)$ is itself a prime spectrum: $V(E)$ can be identified with
$\operatorname{Spec}(A/P_E)$ where $P_E$ is the ideal generated by $E$. This identification is
induced by the canonical homomorphism

$$A \to A/P_E.$$

There is also an important basis of open subsets of $\operatorname{Spec}(A)$ consisting of
the sets $D(f) = \operatorname{Spec}(A[1/f])$ for $f \in A$. In fact for each $E \subset A$ we have
$\operatorname{Spec}(A) \setminus V(E) = \bigcup_{f \in E} D(f)$.

The spectra $\operatorname{Spec}(A)$ have very non-classical topologies. As a rule, these
spaces are not Hausdorff. The closure of any point $x \in \operatorname{Spec}(A)$ can be
described as follows:

$$\overline{\{x\}} = \bigcap_{E \subset \mathfrak{p}_x} V(E) = V\Big(\bigcup_{E \subset \mathfrak{p}_x} E\Big) = V(\mathfrak{p}_x) = \{y \in \operatorname{Spec}(A) \mid \mathfrak{p}_y \supset \mathfrak{p}_x\}.$$

In particular this space is homeomorphic to $\operatorname{Spec}(A/\mathfrak{p}_x)$, so only the points
corresponding to the maximal ideals are closed. If $y \in \overline{\{x\}}$, one sometimes says
that $y$ is a specialization of $x$; this is equivalent to $\mathfrak{p}_x \subset \mathfrak{p}_y$. If $A$ has no
zero divisors then the ideal $(0) \in \operatorname{Spec}(A)$ corresponds to the generic point of
$\operatorname{Spec}(A)$, whose closure coincides with the whole spectrum. One can imagine
that the points of $\operatorname{Spec}(A)$ have different depths which can be, loosely
speaking, measured by the number of specializations of the generic point necessary
to reach a given point. This idea leads to one of the definitions of dimension in
algebraic geometry. A sequence $x_0, x_1, \dots, x_n$ of points of a topological space
$X$ is called a chain of length $n$ beginning at $x_0$ and ending at $x_n$ if $x_i \ne x_{i+1}$
and $x_{i+1}$ is a specialization of $x_i$ for all $i$. The dimension $\dim(X)$ is defined
to be the maximal length of such chains.

For example in $X = \operatorname{Spec} K[T_1, \dots, T_n]$ (where $K$ is a field) there
is a chain $(0) \subset (T_1) \subset \dots \subset (T_1, \dots, T_n)$, so $\dim(X) \ge n$. Similarly,
$\dim \operatorname{Spec} \mathbb{Z}[T_1, \dots, T_n] \ge n + 1$ because there is a chain

$$(0) \subset (p) \subset (p, T_1) \subset (p, T_1, T_2) \subset \dots \subset (p, T_1, T_2, \dots, T_n).$$

Actually, in both cases equality holds.

Passing to the closures instead of the points themselves, one can say that 
this is a variant of the old “definition” of dimension due to Euclid: points are 
boundaries of curves, curves are boundaries of surfaces, surfaces are bound- 
aries of solids. 

Arithmetical intuition is greatly enhanced when one considers rings of
arithmetical type (that is, quotient rings $\mathbb{Z}[T_1, \dots, T_n]/P$) and their spectra
as analogues of algebraic varieties over fields.
This is in the spirit of the general analogy between numbers and functions.
For example, integral extensions of rings correspond to coverings of complex
varieties, in particular Riemann surfaces. More precisely, let $\varphi : R \subset S$ be
an integral extension such that $S$ is a finitely generated $R$-module. Then the
corresponding contravariant map ${}^a\varphi : \operatorname{Spec}(S) \to \operatorname{Spec}(R)$ is surjective, and
its restriction to the subset $\operatorname{Spm}(S)$ of maximal ideals (closed points) is also
surjective onto $\operatorname{Spm}(R)$ (cf. [Sha88]).

For $x \in \operatorname{Spec}(R)$, the fiber $({}^a\varphi)^{-1}(x)$ can be described as $\operatorname{Spec}(S/\varphi(\mathfrak{p}_x)S)$.
The structure of the fibers over closed points is described by a decomposition
theorem. In particular ${}^a\varphi$ is called unramified at $x \in \operatorname{Spm}(R)$ if $S/\varphi(\mathfrak{p}_x)S$ has
no nilpotents, and is therefore a direct sum of fields.


Example 5.4. Figure 5.1 depicts $\operatorname{Spec}(\mathbb{Z}[i])$ as a covering of $\operatorname{Spec}(\mathbb{Z})$ (cf.
[Sha88]). The generic point $\omega'$ of $\operatorname{Spec}(\mathbb{Z}[i])$ projects onto the generic point $\omega$
of $\operatorname{Spec}(\mathbb{Z})$. The other points are closed. A closed point of $\operatorname{Spec}(\mathbb{Z})$ is essentially
a prime $p$. The fiber $({}^a\varphi)^{-1}((p))$ consists of the prime ideals of $\mathbb{Z}[i]$ dividing
$p$. They are principal. There are two of them if $p \equiv 1 \pmod 4$; otherwise there
is one. Only 2 is ramified (of multiplicity two).


Fig. 5.1. Spec Z[i] as a covering of Spec Z, drawn over the closed points (2), (3), (5), (7), (11), (13) of Spec Z.
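The fibers described in Example 5.4 can be computed from the description $\operatorname{Spec}(S/\varphi(\mathfrak{p}_x)S)$: here $\mathbb{Z}[i]/p\mathbb{Z}[i] \cong \mathbb{F}_p[T]/(T^2 + 1)$, so the primes over $p$ correspond to the irreducible factors of $T^2 + 1$ modulo $p$. A minimal sketch, with a hypothetical function name:

```python
def fiber_over(p):
    """Describe the primes of Z[i] over a rational prime p via the roots of T^2 + 1 mod p."""
    roots = [t for t in range(p) if (t * t + 1) % p == 0]
    if len(roots) == 2:
        return {"primes": 2, "ramified": False}   # two distinct roots: p splits
    if len(roots) == 1:
        return {"primes": 1, "ramified": True}    # a double root: nilpotents in the fiber
    return {"primes": 1, "ramified": False}       # T^2 + 1 irreducible: p stays prime
```

Running it over 2, 3, 5, 7, 11, 13 reproduces the picture: two points over 5 and 13, one point over 3, 7 and 11, and a single ramified point over 2.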


Notice that Spec(Z) and Spec(Z[i]) are one-dimensional (as are algebraic
curves). More precisely, Spec(Z) should be thought of as being an analogue 
of the affine line, that is, the projective line minus one point. (We shall later 
explain how one “compactifies” Spec(Z) by adding the arithmetical infinity). 
This analogy can be illustrated by two deep theorems of algebraic number 
theory. The first is Minkowski’s theorem that Q has no proper unramified 
extensions. The second theorem is Hermite’s theorem that Q (or any finite 
extension of Q) has only a finite number of extensions with given ramification 
points and bounded degree. 
These arithmetical facts have their geometric counterparts in the theory
of Riemann surfaces: the Riemann sphere has no non-trivial unramified
coverings, and the number of coverings (up to isomorphism) of a given compact
Riemann surface $X$, which are unramified outside of a given finite set of points
and have a fixed degree, is finite. To prove these statements, one can use the
following formula due to Hurwitz. Let $f : Y \to X$ be a covering of Riemann
surfaces; $g_X, g_Y$ their genera and $e_P$ the ramification index of $f$ at a point
$P \in Y$. Then

$$2g_Y - 2 = \deg(f)(2g_X - 2) + \sum_{P \in Y} (e_P - 1). \tag{5.1.1}$$

Alongside this one uses an explicit description of the fundamental group
$\pi_1(X \setminus S)$ of a Riemann surface with a finite set of points $S$ removed. This
group has only finitely many subgroups of a given index.
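Formula (5.1.1) is easy to evaluate mechanically. The helper below (a hypothetical name, assuming $Y$ is connected and compact) solves it for $g_Y$ and reports when no admissible genus exists, which is exactly how one sees that the sphere has no unramified coverings of degree greater than one.

```python
def genus_of_cover(degree, genus_base, ramification_indices):
    """Solve (5.1.1) for g_Y; return None when no integer genus >= 0 satisfies it."""
    # 2*g_Y = deg(f)*(2*g_X - 2) + sum(e_P - 1) + 2
    twice = degree * (2 * genus_base - 2) + sum(e - 1 for e in ramification_indices) + 2
    if twice < 0 or twice % 2:
        return None
    return twice // 2
```

A double cover of the sphere branched at six points has genus 2, while `genus_of_cover(2, 0, [])` returns `None`: an unramified double cover of the sphere would need $g_Y = -1$.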

A more sophisticated version of this analogy (dealing with algebraic curves
over number fields instead of finite extensions) was developed by I. R. Shafarevich
in his Stockholm ICM talk (cf. [Sha62]). The finiteness conjectures stated
in this talk prompted a wealth of research which eventually led to the proof
of all these conjectures as well as the Mordell conjecture on the finiteness of
the number of rational points on any curve of genus $g > 1$ over a number field
([Fal83], see also §5.5).


5.1.9 Schemes 


The notion of a scheme is basic to algebraic geometry. An affine scheme is 
essentially a pair (Spec(A), A), where A is a commutative ring. More precisely, 
it is a topological space Spec(A) = X, endowed with a sheaf of local rings Ox 
whose ring of sections over an open set D(f) is A[f~+]. A general scheme X is 
a topological space X with a structure sheaf Ox such that (X,O,x) is locally 
(in a neighbourhood of each point) isomorphic to an affine scheme (see [Ha77], 
[Sha88]). 

Schemes form a category. Morphisms of affine schemes are defined to corre- 
spond bijectively to the homomorphisms of the commutative rings. Morphisms 
of schemes are given by such homomorphisms locally. 

For a commutative ring K, one can define a K—scheme as a morphism X — 
Spec(K’). In the category of K-schemes, morphisms should be compatible 
with the structural morphisms to Spec(A). Every affine scheme defining X 
has locally a canonical structure as the spectrum of a K—algebra. 

A scheme is called irreducible if its topological space is irreducible, i.e. if 
it is not a non-trivial union of two closed subspaces. 

We shall say that X is a scheme of geometric type if it can be covered by a 
finite number of spectra of rings of finite type over a field K. Similarly we say
that X is a scheme of arithmetic type if it can be covered by a finite number 
of spectra of rings of finite type over Z. 

These two classes have a non-empty intersection consisting of geometric
schemes over finite fields $\mathbb{F}_q$. They were and are a standard testing ground
for various conjectures in which geometric and arithmetical intuitions are
combined. We shall repeatedly turn to this class of schemes. In particular,
if $X \to \operatorname{Spec}(\mathcal{O}_K)$ is a scheme of arithmetic type over the ring of integers
$\mathcal{O}_K$ of a number field $K$, we can define for every prime ideal $\mathfrak{p} \subset \mathcal{O}_K$ the
reduction $X \bmod \mathfrak{p}$. This is a scheme over the finite field $\mathcal{O}_K/\mathfrak{p}$. Considerable
arithmetical information concerning X is encoded in the set of all reductions
$X \bmod \mathfrak{p}$.

For a scheme X of one of these types, dim X is defined to be the maximal 
length of a chain 

$$Z_0 \subset Z_1 \subset \cdots \subset Z_n, \qquad Z_i \neq Z_{i+1},$$


consisting of irreducible subspaces of X. If X is itself irreducible, with generic 
point x having residue field R(x), then dim X coincides with the so-called 
Kronecker dimension of R(x), that is, the transcendence degree of R(x) over 
its prime subfield, enlarged by one if Char R(x) = 0. In particular, 


$$\dim \mathbb{A}^n_{\mathbb{Z}} = \dim \operatorname{Spec} \mathbb{Z}[x_1, \dots, x_n] = n + 1.$$


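Example 5.5. The projective space $\mathbb{P}^n_K$ over a ring K. Consider the
polynomial ring $S = K[T_0, \dots, T_n]$ graded by total degree, $S = \bigoplus_{d \ge 0} S_d$. Put
$S_+ = \bigoplus_{d > 0} S_d$. This is a graded ideal. Define Proj(S) to be the set of all
homogeneous prime ideals of S which do not contain $S_+$. It is a topological
space, whose closed subspaces are the sets


$$V(\mathfrak{a}) = \{\mathfrak{p} \in \operatorname{Proj}(S) \mid \mathfrak{p} \supset \mathfrak{a}\}$$


where $\mathfrak{a}$ is a homogeneous ideal of S. In order to turn Proj(S) into a scheme,
put

$$A_i = K[T_0/T_i, \dots, T_n/T_i], \qquad A_{ij} = A_i[T_i/T_j].$$

We can identify $\operatorname{Spec}(A_i)$ with an open subset of Proj(S) in such a way that
$\operatorname{Spec}(A_i) \cap \operatorname{Spec}(A_j) = \operatorname{Spec}(A_{ij})$.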


The structure sheaves can also be glued together in a coherent way. As a
result, Proj(S) becomes a K-scheme $\mathbb{P}^n_K$, which is called the projective space
over K.


5.1.10 Ring-Valued Points of Schemes 


Let $X \to \operatorname{Spec}(K)$ be a K-scheme and L a K-algebra. We define an
L-point of X (over K) to be a morphism $\operatorname{Spec}(L) \to X$ over K. Denote the set
$\operatorname{Mor}_K(\operatorname{Spec}(L), X)$ of L-points by X(L). If L is a field, we call these points
geometric.


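Example 5.6. a) Let $X = \mathbb{A}^n_{\mathbb{Z}}$, $K = \mathbb{Z}$, $L = \mathbb{F}_q$. Then an L-point

$$\mathbb{Z}[T_1, \dots, T_n] \to \mathbb{F}_q$$

is an n-tuple

$$(t_1, \dots, t_n) \in \mathbb{F}_q^n.$$

Hence Card $X(\mathbb{F}_q) = q^n$.

b) Let $X = \mathbb{P}^n_{\mathbb{Z}}$, $K = \mathbb{Z}$, $L = \mathbb{Z}/N\mathbb{Z}$. An element of $X(\mathbb{Z}/N\mathbb{Z})$ is a class of
(n + 1)-tuples $(t_0 : \dots : t_n) \in (\mathbb{Z}/N\mathbb{Z})^{n+1}$ such that at least one of the
coordinates is invertible. Two tuples are equivalent iff their coordinates
differ by a common factor in $(\mathbb{Z}/N\mathbb{Z})^*$. The i-th coordinate is invertible
precisely when the point lies in $\operatorname{Spec}(A_i)$ (cf. §5.1.9). It is not difficult to
count the total number of $\mathbb{Z}/N\mathbb{Z}$-points: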


5.1.11 Solutions to Equations and Points of Schemes


Solving a Diophantine equation or a system of equations is the same as finding 
a point in a scheme of arithmetic type. In fact, a family of polynomials over 
a ring K,

$$F_i(T_1, \dots, T_n) \in K[T_1, \dots, T_n] \qquad (i \in I)$$


generates an ideal $\mathfrak{a} \subset K[T_1, \dots, T_n]$ and for any K-algebra L, the L-points of
the affine scheme $\operatorname{Spec}(K[T_1, \dots, T_n]/\mathfrak{a})$ correspond bijectively to the solutions
of $F_i = 0$ in $L^n$.
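
For a finite field $L = \mathbb{F}_p$ this correspondence can be made concrete by exhaustive search. The following is a minimal sketch, not an efficient method; the helper name `count_points` and the sample equation are assumptions introduced purely for illustration.

```python
from itertools import product

def count_points(polys, n, p):
    """Count the F_p-points of Spec(Z[T_1..T_n]/a), i.e. the common zeros
    in (Z/pZ)^n of all F in polys.

    polys: Python callables of n integer arguments, hypothetical stand-ins
    for the polynomials F_i; p: a prime, so that Z/pZ is the field F_p.
    """
    return sum(
        1
        for t in product(range(p), repeat=n)
        if all(F(*t) % p == 0 for F in polys)
    )

# Sample system (an assumption, not from the text): the circle T1^2 + T2^2 = 1.
circle = [lambda x, y: x * x + y * y - 1]

for p in (3, 5, 7):
    print(p, count_points(circle, 2, p))
```

With an empty family of equations the same function counts the points of affine space itself, in agreement with Card $X(\mathbb{F}_q) = q^n$ from Example 5.6 a).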

If the $F_i$ are homogeneous then we may consider the corresponding
projective scheme

$$\operatorname{Proj}(K[T_1, \dots, T_n]/\mathfrak{a})$$


and its points. For a general algebra L, the relation between L-points and
solutions here is somewhat complicated. For example if L is the ring of integers
in an algebraic number field, then the set of L-points of $\mathbb{P}^n_K$ is related to the
ideal class group of L. However when L is a field, the L-points correspond to
non-zero L-valued solutions up to a homogeneity factor.

Projective space over a field can be obtained from the affine space by 
adding the hyperplane at infinity. Intuitively the transition to projective 
schemes is a kind of compactification. For this reason, projective schemes 
and varieties possess many nice geometric properties which play an important 
role in arithmetical investigations. 


5.1.12 Chevalley’s Theorem


Theorem 5.7. Let X be a subscheme of $\mathbb{P}^n_K$ over a finite field $K = \mathbb{F}_q$ defined
by an equation $F(T_0, \dots, T_n) = 0$, where F is a form of degree d and $n > d$.
Then the set $X(\mathbb{F}_q)$ (of projective solutions) is non-empty.


cf. ([BS85], [War36]). 

Denote by $N_F$ the number of solutions of $F = 0$ in $\mathbb{F}_q^{n+1}$, i.e. the number
of $\mathbb{F}_q$-points of the corresponding affine scheme. We shall prove that $p \mid N_F$
where p is the prime dividing q. Since $F(0, \dots, 0) = 0$, this shows that there
must also be a nonzero solution.


Obviously $1 - F(T)^{q-1}$ is equal to $1 \in \mathbb{F}_q$ at the points of (the cone over)
X and 0 elsewhere. Therefore,


$$\overline{N}_F = N_F \bmod p = \sum_{t \in \mathbb{A}^{n+1}(\mathbb{F}_q)} \left(1 - F(t)^{q-1}\right). \qquad (5.1.3)$$


We now expand the right hand side of (5.1.3) into a sum of monomials. Most
of these will add up to zero. More precisely,


$$\sum_{t \in \mathbb{F}_q^{n+1}} t_0^{i_0} \cdots t_n^{i_n} = 0 \qquad (5.1.4)$$


unless all the $i_j$ are non-zero and are divisible by $q - 1$. This can be checked
for n = 0 directly, and then for general n by expanding the sum in (5.1.4) into
the product of n + 1 factors.

If a monomial $T_0^{i_0} \cdots T_n^{i_n}$ appears in the expansion of $1 - F(T)^{q-1}$,
then necessarily $i_j < q - 1$ for at least one j; otherwise we would have
$(q - 1) \deg F(T) \ge (n + 1)(q - 1) > n(q - 1)$, contradicting the assumption that $d < n$. Hence
finally $\overline{N}_F = 0$, so $p \mid N_F$.
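
The divisibility $p \mid N_F$ is easy to observe numerically over a prime field. The sketch below is a brute-force check under stated assumptions: the quadratic form `quadric` is a hypothetical test case (degree d = 2 in n + 1 = 4 variables, so n = 3 > d), and the search is exponential in the number of variables.

```python
from itertools import product

def count_zeros(F, nvars, p):
    """Number N_F of zeros of F in F_p^nvars (p prime), by exhaustive search."""
    return sum(1 for t in product(range(p), repeat=nvars) if F(*t) % p == 0)

# Hypothetical test form: degree 2 in 4 variables, so the theorem applies.
def quadric(x0, x1, x2, x3):
    return x0 * x0 + x1 * x1 + x2 * x2 + 3 * x3 * x3

for p in (2, 3, 5, 7):
    N = count_zeros(quadric, 4, p)
    # p divides N_F, and N_F >= 1 because of the trivial zero,
    # hence N_F >= p > 1 and a non-trivial zero exists.
    print(p, N, N % p == 0)

# When the degree is not smaller than the number of variables the
# conclusion can fail: x^2 + y^2 has only the trivial zero over F_3.
print(count_zeros(lambda x, y: x * x + y * y, 2, 3))
```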


5.1.13 Some Geometric Notions 


In this subsection, we shall briefly review some notions of algebraic geometry 
over fields, which will be used later. For a detailed treatment we refer the 
reader to the volumes of this series devoted to algebraic geometry. 


i) Irreducible components. Every K-variety (a geometric scheme over a field
K without nilpotent elements in its structure sheaf) is a finite union of its
irreducible components. After a finite algebraic extension of the base field
K, an irreducible variety may become reducible (its components may form
an orbit with respect to the action of an appropriate Galois group). A va- 
riety which remains irreducible after any algebraic extension of the ground 
field, is called absolutely (or geometrically) irreducible. Each irreducible 
variety has a well defined dimension. 

ii) Singular points. A point $x \in V$ can be singular or non-singular (regular).
Amongst the many equivalent definitions of regularity, the following
is probably the shortest: x is regular iff the completion of the local ring
$\mathcal{O}_x$ (with respect to the $\mathfrak{m}_x$-adic topology where $\mathfrak{m}_x$ is the maximal ideal)
is isomorphic to a ring of formal power series over $k(x) = \mathcal{O}_x/\mathfrak{m}_x$. The
regular points form a Zariski open subset of V. If V is given by a homogeneous
equation $F(x_1, \dots, x_n) = 0$ in a projective space, one can obtain
additional equations for the subvariety of singular points by putting
$\partial F(x)/\partial x_i = 0$.

Intersection points. A point of intersection of two irreducible components is 
always singular. 

Genus and singular points. The existence of singular points can drastically 
change both the geometry and the arithmetical properties of a variety. 
For example, a non-singular cubic curve in a projective plane has genus 
one; its set of rational points over, say, Q is quite small (cf. §5.3 below). 
When such a curve acquires a double point, the genus of its non-singular 
model becomes zero, and its set of rational points becomes much larger. 

iii) Embeddings and heights. A variety V given abstractly by an affine atlas and
gluing rules may or may not be embeddable in a projective space. A vari- 
ety which is given as a subvariety of a projective space admits in general 
many more inequivalent embeddings. A choice of such an embedding (if it 
exists at all) is an extremely important additional structure. In geometry, 
it allows one to use various induction techniques (fibration by hyperplane 
sections etc.). In algebra, it governs most of the sheaf cohomology calcula- 
tions via various finiteness and vanishing results. In arithmetic the choice 
of an embedding leads to the notion of the height of a rational point, which 
is used in most of the quantitative problems of Diophantine geometry.

Divisors and invertible sheaves. We therefore say a few words about divisors
and invertible sheaves, the universally used geometric notions which generalize
the ideas of a hyperplane section and a projective embedding.

Cartier divisors. Let V be a variety. A (Cartier) divisor on V is given in an
affine atlas $V = \bigcup_i U_i$ by a family of elements $\{f_i\}$, where $f_i$ is a rational
function on $U_i$. On the intersection $U_i \cap U_j$, we require that $f_i = u_{ij} f_j$
for some regular, regularly invertible function $u_{ij}$. Two families $\{f_i\}$, $\{g_i\}$
determine the same divisor if $f_i = u_i g_i$ for all i, where $u_i$ is a regular and
regularly invertible function on $U_i$. The divisors form a group Div(V) under
the natural composition: $\{f_i\}\{g_i\} = \{f_i g_i\}$. Every hyperplane section
is a divisor. If all the $f_i$ are regular, the divisor is said to be effective.

Picard group. An invertible sheaf on V is a locally free, one dimensional
$\mathcal{O}_V$-module $\mathcal{L}$. The set of all such sheaves up to isomorphism forms a group
Pic(V) with respect to the tensor product. Every divisor D defines an
invertible sheaf $\mathcal{O}(D)$: its sections over $U_i$ can be identified with elements
of $f_i \mathcal{O}_{U_i}$. Vice versa, a meromorphic section of $\mathcal{L}$ defines a divisor D and an
identification $\mathcal{L} \cong \mathcal{O}(D)$. In this way, we have a surjective homomorphism
$\operatorname{Div}(V) \to \operatorname{Pic}(V)$.

Ample sheaves. A projective space has a canonical invertible sheaf $\mathcal{O}(1)$. Each
morphism $\varphi : V \to \mathbb{P}^n$ determines the invertible sheaf $\mathcal{L} = \varphi^*(\mathcal{O}(1))$. The
sheaves $\mathcal{L}$ obtained from the closed embeddings $\varphi$ are called very ample.
$\mathcal{L}$ is called ample if some positive power of it is very ample.

iv) Canonical sheaf. If V is non-singular, one can define the locally free
$\mathcal{O}_V$-module of 1-forms $\Omega^1_V$ whose rank is $d = \dim(V)$. Its d-th exterior power
$\omega_V$ is called the canonical sheaf of V. Its numerical properties have a very
strong influence on the arithmetical properties of V (cf. the next section).
For $V = \mathbb{P}^n$ we have $\omega_V = \mathcal{O}(-n - 1)$, so $\omega_V^{-1}$ is ample. Simultaneously
the set of rational points could not be larger. When $\omega_V$ becomes ample,
one conjectures that most rational points are concentrated on a proper
Zariski closed subvariety.


5.2 Geometric Notions in the Study of Diophantine 
equations 


5.2.1 Basic Questions 


Consider a finite system of polynomial equations over Z. As was explained in 
§5.1, such a system defines an arithmetic scheme X, its set of integral points
X(Z) and sets X(L) for more general rings L, for example, rings of integers 
O of algebraic number fields. 

Let X be a smooth projective algebraic variety over a number field K with
the maximal order $\mathcal{O} = \mathcal{O}_K$. In this case the K-points of X coincide with its
$\mathcal{O}$-points, so we shall speak about X(K) rather than $X(\mathcal{O})$.

In number theory, one is interested in properties of K-rational points X(K)
on X. In algebraic geometry one studies the properties of $X(\mathbb{C})$ considered as
a topological space, analytic manifold, or algebraic variety (or, more generally,
one studies X(L) for various algebraically closed fields L). Geometric methods
in the theory of Diophantine equations are used in order to relate the geometry
of $X(\mathbb{C})$ to the arithmetical properties of X(K).

The relevance of such methods is most evident for congruences, or, more 
generally, varieties over finite fields. A. Weil in his famous note (cf. [Wei49]) for- 
mulated several conjectures concerning the numbers of points of such schemes 
and suggested that there should exist a cohomology theory in finite charac- 
teristic such that a Lefschetz type theorem in this theory would imply (a part 
of) these conjectures. A. Grothendieck and his collaborators developed such a
cohomology theory, and P. Deligne accomplished the realization of Weil’s programme
by proving the Weil-Riemann conjecture in full generality. In Chapter
6, §6.1 we briefly describe these results. 

In this section we survey some known connections between geometry and 
arithmetic over number fields. 


A) Is X(K) non-empty? 
B) Is X(K) finite or infinite? Is it dense in X? 
C) If X(K) is infinite, what is the order of growth of


$$N(H; B) := \operatorname{Card}\{x \in X(K) \mid H(x) \le B\}?$$


Here H is a certain (exponential) “height” function, e.g. in fixed coordinates,
for $X \subset \mathbb{P}^n$,


$$H(x_0, \dots, x_n) = \prod_{v \in \operatorname{Val}(K)} \max_i (|x_i|_v).$$


D) Can one, at least in some sense, describe the set X(K) as a finitely 
generated structure? 
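
For $K = \mathbb{Q}$ the height appearing in question C) can be computed directly: after scaling to coprime integer coordinates every p-adic factor $\max_i |x_i|_p$ equals 1, so only the archimedean factor survives. The sketch below assumes this standard normalisation; the function name `height` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def height(coords):
    """Exponential height of the point (x_0 : ... : x_n) of P^n(Q).

    The coordinates are rescaled to coprime integers, which does not
    change the projective point; the height is then max |x_i|.
    """
    xs = [Fraction(x) for x in coords]
    if all(x == 0 for x in xs):
        raise ValueError("(0 : ... : 0) is not a projective point")
    # Least common multiple of the denominators.
    denom = 1
    for x in xs:
        denom = denom * x.denominator // gcd(denom, x.denominator)
    ints = [int(x * denom) for x in xs]
    g = reduce(gcd, ints)  # positive, since some coordinate is non-zero
    return max(abs(k // g) for k in ints)

print(height([Fraction(1, 2), Fraction(3, 4)]))  # the point (2 : 3)
print(height([2, 4, 6]))                         # the point (1 : 2 : 3)
```

The result is independent of the chosen representative, which is the content of the product formula for this height.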

For any of these questions one may also be interested in algorithmic solutions.
Matiyasevich’s theorem is, however, a strong indication that one will
not be able to answer these questions for all varieties. Instead, one could try to 
prove conditional statements of the type “if X(C) has such-and-such geomet- 
ric properties (is a one-dimensional irreducible non-singular variety, projective 
algebraic group, flag space ...), then X(K) has such-and-such arithmetical 
properties (is finite, finitely generated; N(H; B) grows as a power of B...)”. 
One expects that in the stable range, allowing for a finite extension of K and 
restricting to a Zariski open subset U of X, there is a relation between the 
set of rational points on U and geometric invariants of X. 

Below we shall briefly discuss some results of the latter type, grouping 
them around questions A) — D). 


5.2.2 Geometric classification 


One of the main geometric invariants of a smooth projective variety X is
its canonical class $K_X$ (see §5.1). Algebraic varieties can be classified, very
roughly, according to the ampleness of the anticanonical class $-K_X$, resp.
$K_X$. Varieties with ample $-K_X$ are called Fano varieties, those with ample $K_X$
varieties of general type, and the remaining ones varieties of intermediate type.
In finer classification theories, and in many arithmetic applications, one has
to allow “mild” singularities and to introduce further invariants such as
Kodaira dimension, cones of effective and ample divisors etc.

In dimension one the above classification coincides with the topological
classification of Riemann surfaces: genus 0, $\ge 2$, resp. 1. Fano varieties in
dimension two are called Del Pezzo surfaces. Over an algebraically closed
field, these are:

$$\mathbb{P}^2, \quad \mathbb{P}^1 \times \mathbb{P}^1, \quad S_d$$


where $S_d$ is the blowup of $\mathbb{P}^2$ at $9 - d$ points, and the degree $d = 1, \dots, 8$.
Surfaces of intermediate type include: abelian surfaces and their quotients, K3 
surfaces and Enriques surfaces. The classification of Fano varieties in dimen- 
sion three was a major achievement by Iskovskikh and Mori—Mukai, cf. [Isk77], 
[Isk78], [MoMu84], [MoMu03], completing the work of the Italian school, notably
G. Fano. Examples are cubics, quartics or double covers of $\mathbb{P}^3$ ramified
in a surface of degree 6. Interesting three-dimensional varieties of intermedi- 
ate type are Calabi-Yau threefolds. One knows that in every dimension, the 
number of families of Fano varieties is finite. 

Fano varieties are, in some sense, similar to projective spaces. As we have 
seen, Del Pezzo surfaces over $\mathbb{C}$ are birational to $\mathbb{P}^2$. Generally, Fano varieties
have the following properties, quite important for arithmetic applications:
through every point in X there is a rational curve of anticanonical degree
$\le \dim X + 1$, defined over $\mathbb{C}$, and any two points can be connected by a chain
of rational curves. However, it is unknown whether or not all Fano threefolds 
are dominated by a projective space. 


5.2.3 Existence of Rational Points and Obstructions to the Hasse
Principle


Let X be an algebraic variety over a number field K. An obvious necessary
condition for X(K) to be non-empty is that $X(K_v) \neq \emptyset$ for every completion
$K_v$ of K. If this condition is also sufficient we say that X satisfies the Hasse
(or Minkowski-Hasse) principle.

Using the circle method one can prove the Hasse principle for complete 
intersections in projective spaces whose dimension is sufficiently large with 
respect to the degree. B.J. Birch (1962) has proved the following general result. 

Let $X \subset \mathbb{P}^{n-1}$ be given by r equations of degree d. Assume that the dimension of the
subvariety of singular points of X is less than


$$n - 1 - r(r+1)(d-1)2^{d-1}.$$


Then X satisfies the Hasse principle. In particular, it holds for


a) quadrics of dimension $\ge 3$ (with number of variables $n \ge 5$);
b) intersections of two quadrics of dimension $\ge 10$ ($n \ge 13$);
c) cubic hypersurfaces of dimension $\ge 15$ ($n \ge 17$).
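
The thresholds in a) to c) follow from the bound by simple arithmetic. The sketch below makes one assumption that is not spelled out above: X is taken to be non-singular, so that its singular locus is empty and is given dimension $-1$ by convention. The function names are hypothetical.

```python
def birch_threshold(r, d):
    """The quantity r(r+1)(d-1)2^(d-1) subtracted in the bound above."""
    return r * (r + 1) * (d - 1) * 2 ** (d - 1)

def min_variables_smooth(r, d):
    """Smallest n with n - 1 - r(r+1)(d-1)2^(d-1) > -1.

    Assumption: X is non-singular, with the empty singular locus
    assigned dimension -1.
    """
    return birch_threshold(r, d) + 1

for name, r, d in [("quadrics", 1, 2),
                   ("two quadrics", 2, 2),
                   ("cubic hypersurfaces", 1, 3)]:
    n = min_variables_smooth(r, d)
    # dim X = (n - 1) - r for a complete intersection in P^(n-1).
    print(name, "n >=", n, "dimension >=", n - 1 - r)
```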


One conjectures that this is true for $n \ge 9$ in case b) and $n \ge 10$ in case c);
this last conjecture was proved, over $\mathbb{Q}$, by Ch. Hooley (cf. [H88]).

The best known results for the case b) are due to J.-L. Colliot-Thélène,
J.-J. Sansuc, and P. Swinnerton-Dyer.

Of course, the case of quadrics is classical. For cubic forms in 3 and 4 vari- 
ables the Hasse principle may fail. A conceptual approach to higher obstruc- 
tions to the existence of rational points was proposed in [Man70a], [Man72b]. 
It is based on the Hasse-Minkowski principle for the Brauer group over a num- 
ber field (see in §4.5 of the previous Chapter, the exact sequence (4.5.43)), 
and Grothendieck’s generalization of the Brauer group for schemes. One has 
the following diagram 


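$$\begin{array}{ccccccccc}
 & & \operatorname{Br}(X) & \longrightarrow & \bigoplus_v \operatorname{Br}(X_v) & & & & \\
0 & \longrightarrow & \operatorname{Br}(K) & \longrightarrow & \bigoplus_v \operatorname{Br}(K_v) & \xrightarrow{\ \sum_v \operatorname{inv}_v\ } & \mathbb{Q}/\mathbb{Z} & \longrightarrow & 0
\end{array}$$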


In detail, if X is a scheme over a field K, an element $a \in \operatorname{Br}(X)$ is represented
by a family of semi-simple algebras parametrized by X. In particular, for
any extension field $L \supset K$ and an L-point $x \in X(L)$, one has a natural
specialization $a(x) \in \operatorname{Br}(L)$, with obvious functorial properties. Assume that
$X(K_v) \neq \emptyset$ for all v and that for every $(x_v)_v \in X(\mathbb{A})$, where $\mathbb{A}$ is the adèle
ring of K, there exists an $a \in \operatorname{Br}(X)$ such that


$$\sum_v \operatorname{inv}_v(a(x_v)) \neq 0.$$

Then $(x_v)_v$ cannot belong to X(K) and one says that X has a non-trivial
Brauer–Manin obstruction to the Hasse principle.

One of the simplest examples in which the Brauer–Manin obstruction is
non-trivial, is furnished by the projective cubic surface X over $\mathbb{Q}$:


$$z(x + z)(x + 2z) = \prod_{i=1}^{3} \left(x + \theta^{(i)} y + \theta^{(i)2} z\right),$$


where $\theta^{(i)}$ are the three roots of

$$\theta^3 + 7(\theta + 1)^2 = 0$$

(this example is due to Swinnerton-Dyer). Its set of adélic points is non-empty. 
A local analysis shows that one can construct two elements $a_1$, $a_2$ of the
Brauer group of this surface with the following properties:


i) if $v \neq 7$, the local invariants of $a_i(x_v)$ vanish for every $x_v \in X(\mathbb{Q}_v)$;
ii) for every $x_7 \in X(\mathbb{Q}_7)$, either $\operatorname{inv}_7(a_1(x_7)) \neq 0$, or $\operatorname{inv}_7(a_2(x_7)) \neq 0$.


Hence the Hasse principle fails for this surface. 

J.-L. Colliot-Thélène, J.-J. Sansuc and D. Kanevsky have compiled a table
of diagonal cubic surfaces $ax^3 + by^3 + cz^3 + du^3 = 0$ with integral coefficients
in the range $[-500, 500]$ having rational points everywhere locally, for which
the Brauer–Manin obstruction vanishes. A computer search has shown that
all these surfaces have rational points. One might therefore conjecture that
the vanishing of this obstruction implies the existence of a rational point for
all diagonal cubic surfaces, or perhaps all non-singular cubic surfaces, or even
all non-singular rational surfaces (i.e. those admitting a birational parametrization
by two independent parameters over $\mathbb{C}$). This conjecture has been
proved for the so-called generalized Châtelet surfaces given by an equation of
the form $y^2 - az^2 = P(x)$, where a is not a square and P is a polynomial of
degree three or four.

The Brauer—Manin obstruction has been thoroughly investigated for three 
classes of varieties: 


i) rational surfaces; 
ii) principal homogeneous spaces of linear algebraic groups, especially alge- 
braic tori; 
iii) principal homogeneous spaces of elliptic curves and more generally Abelian 
varieties. 


Historically, iii) was the first example. However, it appeared in a different form 
in the theory of the Shafarevich—Tate group, whose classical definition will be 
given in the next section. The connection with the Brauer–Manin obstruction
is explained in [Man70a].

J.-L. Colliot-Thélène and J.-J. Sansuc have developed a geometric version
of this obstruction, which is called the descent obstruction. 

Assume that for a variety X over K we have somehow constructed a family
of dominating morphisms $f_i : Y_i \to X$ such that $X(K) = \bigcup_i f_i(Y_i(K))$. Then
one can establish that X(K) is empty by showing that for each $Y_i$ there exists
a completion $K_{v(i)}$ such that $Y_i(K_{v(i)}) = \emptyset$. On the other hand, if X(K) is
non-empty, and the $Y_i$ are in some sense simpler than X, e.g. rational, one
obtains an explicit description of the set X(K).

Colliot-Thélène and Sansuc have developed a systematic way of constructing
such families, based on the notion of a torsor. They have shown that
for non-singular rational varieties these descent families have the following 
properties: 


a) The descent obstruction vanishes iff the Brauer–Manin obstruction
vanishes.
b) The Brauer—Manin obstructions for the descent varieties Y; vanish. 


Using the machinery of torsors, Skorobogatov (cf. [Sko99]) constructed an
example of a surface with trivial Brauer—Manin obstruction and not satisfying 
the Hasse principle. The surface in question has a nontrivial fundamental 
group and the obstruction may be interpreted via non-abelian descent (see 
also [Sko01] and [HaSk02]). 


5.2.4 Finite and Infinite Sets of Solutions 


Once it is established that the set of rational points X(K) on an algebraic
variety X over a number field K is not empty, one could ask whether this set
is finite or, for example, dense in X. First of all, let us describe the results in
the case of smooth projective curves:


i) Let X be a curve of genus zero. Then X satisfies the Hasse principle. More
precisely, X can be given by a homogeneous quadratic equation in $\mathbb{P}^2$:


$$aX^2 + bY^2 + cZ^2 = 0$$


and the local conditions can be checked algorithmically. If $X(K) \neq \emptyset$ then
X is isomorphic to $\mathbb{P}^1_K$, so that $X(K) = K \cup \{\infty\}$ is Zariski dense in X.
ii) If X is of genus 1 then X(K) can be empty, finite or infinite. Even over $\mathbb{Q}$,
one does not know a provably correct algorithm allowing one to distinguish
between these cases. However, there are algorithms that work in practice.
In [Man71] an algorithm was suggested to answer the finite/infinite question
when it is known that X(K) is non-empty. If one assumes certain
general conjectures on elliptic curves (the Birch–Swinnerton-Dyer conjecture
and the Shimura–Taniyama–Weil conjecture, cf. the next section, and
[Man71]) then one can deduce the correctness of this algorithm. Moreover,
X(L) always becomes infinite over an appropriate finite extension L of K.
iii) If X is of genus $> 1$ then X(K) is always finite. This is the famous Mordell
conjecture, proved by G. Faltings. For more details see the following three
sections.
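
For $K = \mathbb{Q}$ the local conditions in case i) can be made fully explicit through Legendre's classical theorem, which is not stated in the text and is used here as an outside fact: for non-zero, squarefree, pairwise coprime integers a, b, c, the conic $aX^2 + bY^2 + cZ^2 = 0$ has a rational point iff a, b, c do not all have the same sign and $-bc$, $-ca$, $-ab$ are squares modulo $|a|$, $|b|$, $|c|$ respectively. A minimal sketch under these assumptions (all function names are hypothetical, and residues are tested by brute force):

```python
from math import gcd

def is_square_mod(a, m):
    """True if a is a square in Z/mZ (brute force; m >= 1)."""
    a %= m
    return any((x * x - a) % m == 0 for x in range(m))

def is_squarefree(n):
    """True if n is non-zero and not divisible by any square k^2 > 1."""
    n = abs(n)
    return n != 0 and all(n % (k * k) for k in range(2, int(n ** 0.5) + 1))

def conic_has_rational_point(a, b, c):
    """Legendre's criterion for aX^2 + bY^2 + cZ^2 = 0 over Q.

    Assumes a, b, c are non-zero, squarefree and pairwise coprime;
    other inputs are rejected rather than normalised.
    """
    if not all(map(is_squarefree, (a, b, c))):
        raise ValueError("coefficients must be non-zero and squarefree")
    if gcd(a, b) != 1 or gcd(b, c) != 1 or gcd(a, c) != 1:
        raise ValueError("coefficients must be pairwise coprime")
    if (a > 0) == (b > 0) == (c > 0):
        return False  # definite form: no real points
    return (is_square_mod(-b * c, abs(a))
            and is_square_mod(-c * a, abs(b))
            and is_square_mod(-a * b, abs(c)))

print(conic_has_rational_point(1, 1, -1))  # X^2 + Y^2 = Z^2
print(conic_has_rational_point(1, 1, -3))  # X^2 + Y^2 = 3 Z^2
```

No comparably simple criterion is available for the genus one case ii), which is the point of the contrast drawn above.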


Note that this rather distinct arithmetic behaviour is well aligned with the 
classification of one-dimensional algebraic varieties recalled in Section 5.2.2. 
One hopes that this property persists in higher dimensions as well. Bombieri, 
for surfaces, and Lang—Vojta, in general, conjectured that rational points on 
varieties of general type are always contained in proper subvarieties, i.e., they 
are never Zariski dense. If true, this conjecture would have remarkable consequences:
non-uniqueness of the Brauer–Manin obstruction for general hypersurfaces
[SarWa], uniform (in terms of K) upper bounds for the number of
rational points on curves of genus $g \ge 2$ etc., cf. [CHM97].

One may ask for a converse to this conjecture. Note that some care is 
necessary. First of all, one has to allow finite extensions of the ground field 
(already conics may have no rational points at all). Thus we ask for poten- 
tial density, i.e., Zariski density after a finite extension of the ground field. 
Secondly, X may not be of general type while admitting an étale cover which 
dominates a variety of general type. In this case, rational points on X cannot 
be Zariski dense. In dimension two, rational points are potentially dense on 


i) Del Pezzo surfaces; 
ii) abelian, Enriques and K3 surfaces with an elliptic fibration or an infinite 
automorphism group [BoTs99]. 


All Fano threefolds, except double covers of $\mathbb{P}^3$ ramified in a surface of degree
6, are known to satisfy potential density [HaTsch], [BoTs2000]. 

To get an idea how these results are proved, assume that there is a non- 
trivial rational map f : C — X, where C is a curve of genus 0 or 1 with C(K) 
infinite. Then, of course, X(K) is also infinite. Families of such embedded
curves can often be constructed by geometric methods. 


Examples. 


a) Every $a \in K^*$ can be represented in an infinite number of ways as a sum
of three cubes in K. In fact, one representation is given by the identity


$$a = \left(\frac{1}{3^2 a^2 + 3^4 a + 3^6}\right)^3 \left((a^3 - 3^6)^3 + (-a^3 + 3^5 a + 3^6)^3 + (3^3 a^2 + 3^5 a)^3\right).$$

The geometric picture is as follows: For any non-singular cubic surface X
and any point $x \in X(K)$, denote by $C_x$ the intersection of X with the
tangent plane to X at x. If x does not belong to a line in X then $C_x$ is a
plane cubic curve with a double point at x. Hence it has genus zero and
a rational point (cf. Part I, §1.3). (This argument must be modified in
certain degenerate cases).

b) Euler conjectured in 1769 that the equation

$$x^4 + y^4 + z^4 = u^4 \qquad (5.2.1)$$


has no non-trivial integral solutions. This conjecture was disproved by
N.D. Elkies (1988). He found a solution


$$2682440^4 + 15365639^4 + 18796760^4 = 20615673^4$$


and proved that there are in fact infinitely many solutions by constructing
an elliptic curve lying on (5.2.1) with infinitely many points. Potential
density of rational points on this quartic, and in fact on any smooth quartic
surface containing a line, has been proved in [HaTsch], [BoTs99]. Let us
sketch the geometry behind this result: let X be a quartic surface with
a line $\ell$ and consider the one-parameter family of hyperplanes $\mathbb{P}^2 \subset \mathbb{P}^3$
containing $\ell$. For each $\mathbb{P}^2$ in this family, its intersection with X is a curve
of degree 4 containing $\ell$. The residual curve to $\ell$ is a plane curve of degree
3 intersecting $\ell$ in three points. Thus X admits a fibration over $\mathbb{P}^1$ with
generic fiber a curve of genus 1 and a rational trisection $\ell$. Generically, this
implies that rational points on X are Zariski dense already over the ground
field. In some degenerate situations one has to pass to a finite extension to
ensure Zariski density of rational points.

c) Of course, we can find even more points on X if we manage to construct
maps $\mathbb{P}^n \to X$ or $A \to X$, where A is an Abelian variety with large
A(K) etc. Many geometric methods for such constructions are known.
For example, the diagonal quartic threefold


$$x^4 + y^4 + z^4 + t^4 + u^4 = 0$$


is geometrically unirational, i.e., dominated by $\mathbb{P}^3$. It is unknown whether
or not every smooth quartic threefold is unirational.

d) Here is another general method of proving that X(K) is infinite: if X
has an infinite automorphism group G, an orbit Gx of a point x can be
infinite. Examples of K3 surfaces with infinite automorphism groups are
hypersurfaces of degree (2,2,2) in $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$.
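
The two explicit identities of examples a) and b) can be checked with exact arithmetic. The following sketch works over $K = \mathbb{Q}$ only; the function name `three_cubes` is hypothetical, and the coefficients are those of the classical three-cubes parametrisation quoted in a).

```python
from fractions import Fraction

def three_cubes(a):
    """Return (u, v, w) in Q^3 with u^3 + v^3 + w^3 = a (example a)).

    Over Q the denominator 9a^2 + 81a + 729 never vanishes, since its
    discriminant is negative.
    """
    a = Fraction(a)
    D = 3**2 * a**2 + 3**4 * a + 3**6
    u = (a**3 - 3**6) / D
    v = (-a**3 + 3**5 * a + 3**6) / D
    w = (3**3 * a**2 + 3**5 * a) / D
    return u, v, w

for a in (1, 2, -9, Fraction(-7, 5)):
    u, v, w = three_cubes(a)
    print(a, u**3 + v**3 + w**3 == a)

# Elkies' counterexample to Euler's conjecture (example b)).
elkies_ok = 2682440**4 + 15365639**4 + 18796760**4 == 20615673**4
print(elkies_ok)
```

Python integers and `Fraction` are exact, so these comparisons involve no rounding.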


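5.2.5 Number of points of bounded height


Let us start with a heuristic argument. Consider a system of equations

$$F_i(x_0, \dots, x_n) = 0, \qquad i = 1, \dots, r, \qquad (5.2.2)$$

where $F_i$ is a form of degree $d_i$ with integral coefficients. Put

$$N(B) = \operatorname{Card}\{(x_0, \dots, x_n) \in \mathbb{Z}^{n+1} \mid H(x) := \max_i(|x_i|) \le B\},$$

the count being restricted to the solutions of (5.2.2).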


To guess the order of growth of N(B), we may argue as follows. First note
that there are about $B^{n+1}$ points in $\mathbb{Z}^{n+1}$ whose heights are $\le B$. Secondly
$F_i$ takes roughly $B^{d_i}$ values at these points. Assuming that the probability of
taking the zero value is about $B^{-d_i}$, and that these events are independent
for different i's, we get


$$N(B) \approx B^{n+1-\sum_i d_i}. \qquad (5.2.3)$$
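
The heuristic can be tested on a toy case where the exact count is known. The sketch below takes a single linear form in three variables (so $n + 1 = 3$, $d_1 = 1$ and the predicted order of growth is $B^2$); this example is an assumption chosen for illustration, and `count_bounded` is a hypothetical helper that simply enumerates the box.

```python
from itertools import product

def count_bounded(forms, nvars, B):
    """N(B): integer points x with max|x_i| <= B and F(x) = 0 for all F."""
    box = range(-B, B + 1)
    return sum(
        1
        for x in product(box, repeat=nvars)
        if all(F(*x) == 0 for F in forms)
    )

# Toy case (not from the text): the plane x0 + x1 + x2 = 0.
plane = [lambda x0, x1, x2: x0 + x1 + x2]

for B in (1, 2, 4, 8):
    # Columns: B, exact count, heuristic prediction B^(n + 1 - d_1).
    print(B, count_bounded(plane, 3, B), B ** (3 - 1))
```

Here the exact count is $3B^2 + 3B + 1$, so the exponent predicted by the heuristic is correct, while the constant is not determined by it.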


The power on the right hand side of (5.2.3) has a nice geometric interpretation:
if the projective variety X defined by (5.2.2) is a non-singular complete
intersection, then its anticanonical sheaf $-K_X$ is given by the following formula:


$$-K_X = \mathcal{O}\Big(n + 1 - \sum_i d_i\Big),$$


where $\mathcal{O}(1)$ is induced on X by $\mathcal{O}_{\mathbb{P}^n}(1)$. Hence we can reformulate (5.2.3) in
a more general and a more cautious way, taking into account various counterexamples
to this over-optimistic formulation: we expect the order of
growth of N(B) to be $B^a$, $a > 0$, when $-K_X$ is ample and $O(B^{\varepsilon})$ for any
$\varepsilon > 0$ when $K_X$ is ample, if one deletes from X some “point-accumulating”
subvarieties, and if one passes to a sufficiently large ground field.

These conjectures were stated in a precise form by V.V. Batyrev and Yu.I. 
Manin. We shall add some comments without going into much detail. 


a) To obtain a stable picture, in which geometric notions and constructions
can be used, we must pass to finite field extensions.

b) We should consider counting problems with respect to arbitrary invertible
sheaves, not only $-K_X$. The latter could fail to be ample, for example, it
could be zero.


The first step in this program is the theory of heights, going back to an 
old construction of A. Weil. 

Let X be a projective algebraic variety over a number field K and $\mathcal{L}$ an
invertible sheaf on X. Consider all completions $K_v$ of K. Denote by $|\cdot|_v : K_v \to
\mathbb{R}$ the local norm which is the scaling factor of an additive Haar measure with
respect to multiplication by elements of $K_v$. We have the classical product
formula $\prod_v |x|_v = 1$ for all $x \in K^*$. If $\Lambda$ is a one-dimensional vector space
over K, $\|\cdot\|_v : \Lambda \to \mathbb{R}$ denotes a norm such that $\|a\lambda\|_v = |a|_v \|\lambda\|_v$ for all
$a \in K$ and $\lambda \in \Lambda$. The invertible sheaf $\mathcal{L}$ can be considered as a family of
one-dimensional spaces parametrized by X, and one can define an admissible
metrization as a family of metrics $\|\cdot\|_v$ for all v, on each fiber of $\mathcal{L}$, with
natural continuity properties (cf. Lang S. (1983)). Given such a metrized sheaf
$L = (\mathcal{L}, \|\cdot\|_v)$, the height with respect to it is a function $H_L : X(K) \to \mathbb{R}$
defined by the following formula:


$$H_L(x) := \prod_v \|s(x)\|_v^{-1}, \qquad (5.2.4)$$


where s is a local section of $\mathcal{L}$ not vanishing at x. (Its choice is irrelevant due
to the product formula).

For a list of properties of heights, we refer the reader to Lang S. (1983).
We mention only the following ones:


i) Up to a function of the type exp(O(1)), $H_L$ does not depend on the choice
of metrization and is multiplicative in $\mathcal{L}$. We shall therefore write $H_{\mathcal{L}}$
instead, if we are interested only in questions invariant with respect to
such choice.

ii) If $\mathcal{L}$ is ample and $U \subset X$ is a Zariski open subset then the number


$$N_U(\mathcal{L}; B) := \operatorname{Card}\{x \in U(K) \mid H_{\mathcal{L}}(x) \le B\}$$


is finite for every B.
iii) We have


$$N_{\mathbb{P}^n}(-K_{\mathbb{P}^n}; B) = cB(1 + o(1)) \qquad (5.2.5)$$


for all $n > 0$ and number fields K (this is Schanuel’s theorem, Schanuel S.
(1979)).
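
The linear growth in iii) can be observed in the simplest case $\mathbb{P}^1$ over $\mathbb{Q}$. The sketch below assumes the naive height $H(x_0 : x_1) = \max(|x_0|, |x_1|)$ on coprime integer coordinates and uses $H^2$ as a height attached to $-K = \mathcal{O}(2)$; for this normalisation the number of points with $H \le B$ is classically asymptotic to $(12/\pi^2)B^2$. The function names are hypothetical.

```python
from math import gcd, isqrt, pi

def count_P1(B):
    """Number of points (x0 : x1) of P^1(Q) with max(|x0|, |x1|) <= B.

    Each point has exactly two primitive integer representatives,
    +-(x0, x1); we keep the one with x0 > 0, plus the point (0 : 1).
    """
    total = 1  # the point (0 : 1)
    for x0 in range(1, B + 1):
        for x1 in range(-B, B + 1):
            if gcd(x0, x1) == 1:
                total += 1
    return total

def count_anticanonical(B):
    """Same count for the height H^2, i.e. H <= sqrt(B)."""
    return count_P1(isqrt(B))

for B in (10, 100, 1000):
    # Ratio of the count to B; it approaches 12 / pi^2 only slowly.
    print(B, count_anticanonical(B) / B, 12 / pi**2)
```

The ratio in the last loop settles near the constant only for large B, since the error term is far from negligible at this scale.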


A natural generalization of (5.2.5) is the following 


Conjecture 5.8 (Linear Growth, first version). Let X be smooth, with ample
$-K_X$, and let r denote the rank of the Picard group of X. Then there exists a
sufficiently small Zariski open subset $U \subset X$ such that for all sufficiently large ground
fields one has

$$N_U(-K_X; B) = cB\log(B)^{r-1}(1 + o(1)). \qquad (5.2.6)$$


Clearly, the conjecture cannot be true for varieties without rational points
or for cubic surfaces containing rational lines, since each such line would already
contribute about $B^2$ rational points to the asymptotic. These are the
obvious necessary conditions.

Thus, if X is a cubic surface such that all 27 lines on X are defined over
the ground field and $U \subset X$ is the complement to these lines then


$$N_U(-K_X; B) = cB\log(B)^6(1 + o(1)).$$


Lower bounds of this shape have been proved over Q in [SSw-D]. Non-trivial 
upper bounds are unknown. 
Now consider the variety $X \subset \mathbb{P}^3 \times \mathbb{P}^3$ given by the $(1,3)$-form

$$\sum_{j=0}^{3} x_j y_j^3 = 0$$

over a field $K$ containing the cube roots of unity $\sqrt[3]{1}$. The projection to the $x$-coordinates exhibits $X$
as a fibration over $\mathbb{P}^3$ with generic fiber a cubic surface. A Zariski dense set
of fibers corresponds to cubic surfaces with all 27 lines defined over $K$, each
contributing $B\log(B)^6$ to $N(B)$. However, the rank of the Picard group of $X$
is 2 and Conjecture 5.8 predicts $B\log(B)$ points of $-K_X$-height bounded by
$B$, leading to a contradiction [BaTsch98a]. A refined approach to the Linear
Growth conjecture, taking into account such fibration structures, is explained
in [BaTsch98b].

On the other hand, varieties closely related to linear algebraic groups do
satisfy Conjecture 5.8, see its refinement by Peyre [Pey95] and its
generalization to arbitrary ample line bundles in [BatMan] and [BaTsch98b]. Precise
asymptotics, compatible with the above conjectures, are known for:

- smooth complete intersections of small degree (for example, [Bir61]);

- split smooth Del Pezzo surfaces of degree 5 over $\mathbb{Q}$ [dlBre02];

- generalized flag varieties [FMTsch];

- toric varieties [BaTsch98a];

- smooth equivariant compactifications of $G/U$ (horospherical varieties),
where $G$ is a semi-simple group and $U \subset G$ a maximal unipotent subgroup
[StTsch];

- smooth equivariant compactifications of $\mathbb{G}_a^n$ [Ch-LT02];

- smooth bi-equivariant compactifications of unipotent groups [ShT04];

- wonderful compactifications of some semi-simple algebraic groups of adjoint
type [ShT-BT04a], [ShT-BT04b].


Very little is known for general higher-dimensional varieties. Geometric
arguments, based on Mori’s theorem that every point on a Fano variety $X$
lies on a rational curve of degree at most $\dim X + 1$, imply that

$$N_U(\mathcal{L}; B) > cB^{\beta(U,\mathcal{L})}$$

for any dense Zariski open subset $U \subset X$, sufficiently large $K$, and some
positive constants $c > 0$, $\beta(U,\mathcal{L}) > 0$. Batyrev and Manin state conjectures
about the best possible values of $\beta(U,\mathcal{L})$ and relate them to Mori’s theory.
Further developments are reflected in the book [PeyTsch01].


5.2.6 Height and Arakelov Geometry 


S.Yu.Arakelov (cf. [Ara74a] and §III.2) had the brilliant idea of considering 
Hermitian metrizations of various linear objects related to algebraic varieties 
such as invertible sheaves, tangent bundles etc., in order to compactify arith- 
metic schemes over number fields at the arithmetical infinity. In particular, 
each curve has a well defined minimal model over O which is called an arith- 
metical surface (since we added an arithmetical dimension to the geometric 
one). Adding metrics at infinity to this, Arakelov developed the intersection 
theory of arithmetical divisors. Heights in this picture become the (exponen- 
tiated) intersection index, see [Ara74b], [La88]. 

This theory was vastly generalized by H.Gillet and C.Soulé [GS91], [GS92], 
[SABK94], following some suggestions in [Man84]. 

Figure 5.2 is a visualization of a minimal arithmetical surface (this notion
was defined and studied by I.R.Shafarevich, cf. [Sha65], [Sha66]). Its fibers
over the closed points of $\mathrm{Spec}(\mathcal{O})$ can be non-singular (“non-degenerate”, or
with “good reduction”) or singular (having “bad reduction”). Rational points
of the generic fiber correspond to the horizontal arithmetical divisors; there
are also vertical divisors (components of closed fibers) and “vertical divisors at
infinity” added formally, together with an ad hoc definition of their intersection
indices with other divisors, defined via Green’s functions (see §III.2).


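[Figure: an arithmetical surface over Spec O. A point x ∈ X(K) appears as a horizontal divisor; a fibre lies over a closed point v of Spec O; Spec K is the generic point of Spec O.]

Fig. 5.2.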


Arakelov’s picture and the theory of heights played a prominent role in
Faltings’ proof and the subsequent development of his work. We postpone
a more detailed discussion of Arakelov geometry and its relation with
noncommutative geometry to Chapter 8.


5.3 Elliptic curves, Abelian Varieties, and Linear Groups 


5.3.1 Algebraic Curves and Riemann Surfaces 


An algebraic curve is a one-dimensional algebraic variety over a field $K$.
Usually we shall tacitly assume it to be irreducible. Every algebraic curve can be
obtained by deleting a finite number of points from a projective curve. For
every projective curve $C$, there exists a non-singular projective curve $C'$ and
a morphism $C' \to C$ which is an isomorphism outside of the singular points of $C$.
The curve $C'$ is called a (complete) non-singular model of its function field.
It is uniquely defined (up to isomorphism) by this function field.

The genus $g$ of a projective non-singular curve $C$ (and its function field)
can be defined (or calculated) in many ways. Here are some of them:

i) It is the dimension of the space $\Gamma(\omega)$ of regular differential 1-forms on $C$
(the differentials of the first kind).

ii) If $K = \mathbb{C}$ then $g$ is the topological genus (the number of handles) of the
Riemann surface $C(\mathbb{C})$ of complex points of $C$.

iii) Consider a projective embedding $C \subset \mathbb{P}^n$. In general one can take $n \ge 3$
but not $n = 2$: our curve may have no non-singular plane model. However,
$C$ always has a plane projection with only simple double points. Let $d$ be
its degree and $\nu$ the (geometric) number of double points. Then

$$g = \frac{(d-1)(d-2)}{2} - \nu.$$


The basic theorem on algebraic curves is the Riemann-Roch theorem. To 
state this theorem we require some definitions. 

Let D be a divisor on C. It has a degree deg(D): a Cartier divisor on 
a non-singular curve can be identified with a formal linear combination of 
(geometric) points, and the degree is the sum of the coefficients of this linear 
combination. 

Recall that each invertible sheaf $\mathcal{L}$ is isomorphic to a sheaf of the type
$\mathcal{O}(D)$ (cf. §5.1.13). Although $D$ is not uniquely defined by $\mathcal{L}$, its degree is.
We may therefore define $\deg(\mathcal{L}) = \deg(D)$. In particular $\deg(\omega_C) = 2g - 2$,
where $g$ is the genus of $C$. A divisor $K = K_C$ such that $\omega = \mathcal{O}(K)$ is called a
canonical divisor. A sheaf $\mathcal{L}$ is ample iff its degree is positive.

For a divisor $D$, put $l(D) = \dim \Gamma(\mathcal{O}(D))$. The Riemann-Roch theorem
for curves can be stated as follows:

$$l(D) - l(K - D) = \deg(D) - g + 1. \tag{5.3.1}$$


5.3.2 Elliptic Curves 


We shall call a non-singular projective curve $X$ of genus one with a
non-empty set $X(K)$ of $K$-points an elliptic curve. An elliptic curve has exactly
one (up to a constant factor) differential of the first kind. The divisor of this differential is
zero. In other words $\omega_X \cong \mathcal{O}_X$. From the Riemann-Roch theorem (5.3.1) it
follows that $l(D) = \deg(D)$ for $\deg(D) \ge 1$. We can use this to show that i) $X$
is an algebraic group; ii) $X$ is isomorphic to a plane cubic curve. To prove i)
choose a point $o \in X(K)$. For any two points $x, y \in X(K)$ let $D = x + y - o$.
Since $\deg(D) = 1$ we have $l(D) = 1$. It follows that there exists a unique (up to a
constant factor) function $f$ whose divisor is $x + y - o - z$ for some point $z \in X(K)$. Define $x * y := z$.
One can check directly that $*$ is a commutative group law on $X(K)$ (with
identity $o$). Actually, one can refine this construction in order to define
the algebraic addition law, which is a morphism $* : X \times X \to X$ verifying the
standard axioms.

To prove ii) choose a non-constant section $f \in \mathcal{L}(2o)$. Then $f$ has a pole of
order precisely two, since sections of $\mathcal{L}(o)$ are constants. Furthermore $l(3o) = 3$,
so there is a section $h \in \mathcal{L}(3o)$ with a pole of order three at $o$. From $f$ and
$h$ we can construct seven sections of $\mathcal{L}(6o)$: $1, f, f^2, f^3, h, fh, h^2$, whereas
$l(6o) = 6$. Hence these seven sections are connected by a linear relation

$$a_0 + a_1 f + a_2 f^2 + a_3 f^3 + b_0 h + b_1 fh + b_2 h^2 = 0. \tag{5.3.2}$$


Equation (5.3.2) defines a smooth affine cubic curve. Its projective com- 
pletion is a non-singular projective plane model Y of X. The identity point 
o € X(K) corresponds to the infinite point (0: 1:0) of Y, and the group law 
* becomes the law described in Part I, §1.3.2 in terms of secants and tangents. 

Making additional linear changes of variables we may reduce (5.3.2) to the
following (Weierstrass) normal form over a field $K$:

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6,$$

where $a_1, a_2, a_3, a_4, a_6 \in K$ and

$$\Delta = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6 \neq 0,$$

where

$$b_2 = a_1^2 + 4a_2, \quad b_4 = 2a_4 + a_1a_3, \quad b_6 = a_3^2 + 4a_6,$$

and $b_8 = a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2$ (so that $4b_8 = b_2b_6 - b_4^2$).

The notation $j = c_4^3/\Delta$ is used, where

$$c_4 = b_2^2 - 24b_4, \quad c_6 = -b_2^3 + 36b_2b_4 - 216b_6.$$

Then this equation can be further simplified using the transformation
$x \mapsto u^2x' + r$, $y \mapsto u^3y' + su^2x' + t$ in order to obtain the following (cf. [Ta73],
[Kob87]):


1) For $p \neq 2, 3$ (here $p$ denotes the characteristic of $K$):

$$y^2 = x^3 + a_4x + a_6 \quad \text{with} \quad \Delta = -16(4a_4^3 + 27a_6^2) \neq 0. \tag{5.3.3}$$

2) For $p = 2$ the condition $j = 0$ is equivalent to $a_1 = 0$, and
the equation transforms as follows: if $a_1 \neq 0$ (i.e. $j \neq 0$), then choosing
$r, s, t$ suitably we can achieve $a_1 = 1$, $a_3 = 0$, $a_4 = 0$, and the equation
takes the form

$$y^2 + xy = x^3 + a_2x^2 + a_6, \tag{5.3.4}$$

with the condition of smoothness given by $\Delta \neq 0$. Suppose next that
$a_1 = 0$ (i.e. $j = 0$); then the equation transforms to

$$y^2 + a_3y = x^3 + a_4x + a_6, \tag{5.3.5}$$

and the condition of smoothness in this case is $a_3 \neq 0$.

3) For $p = 3$:

$$y^2 = x^3 + a_2x^2 + a_4x + a_6 \tag{5.3.6}$$

(here multiple roots are again disallowed).
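
The quantities attached to a general Weierstrass equation can be checked mechanically. The sketch below computes $b_2, b_4, b_6, b_8, c_4, c_6$, $\Delta$ and $j$ with exact rational arithmetic. The function name `weierstrass_invariants` is ours, and the expression used for $b_8$ is the standard one from the literature, stated here as an assumption since the text does not spell it out.

```python
from fractions import Fraction


def weierstrass_invariants(a1, a2, a3, a4, a6):
    """Invariants of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6.

    Returns (b2, b4, b6, b8, c4, c6, disc, j) as exact fractions.
    """
    a1, a2, a3, a4, a6 = (Fraction(a) for a in (a1, a2, a3, a4, a6))
    b2 = a1**2 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3**2 + 4 * a6
    # Standard definition of b8 (assumed; it satisfies 4*b8 = b2*b6 - b4^2).
    b8 = a1**2 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3**2 - a4**2
    c4 = b2**2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = -b2**2 * b8 - 8 * b4**3 - 27 * b6**2 + 9 * b2 * b4 * b6
    if disc == 0:
        raise ValueError("singular cubic: the discriminant vanishes")
    j = c4**3 / disc
    return b2, b4, b6, b8, c4, c6, disc, j


if __name__ == "__main__":
    # y^2 = x^3 - x: discriminant 64 and j = 1728.
    print(weierstrass_invariants(0, 0, 0, -1, 0)[-2:])
```

For a short equation $y^2 = x^3 + a_4x + a_6$ the computed discriminant agrees with the expression $-16(4a_4^3 + 27a_6^2)$ of (5.3.3), which gives a quick consistency check.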
The proper Weierstrass form (in the case (5.3.3)) is

$$y^2 = 4x^3 - g_2x - g_3. \tag{5.3.7}$$

The discriminant

$$\Delta = g_2^3 - 27g_3^2 \tag{5.3.8}$$

does not vanish. The coefficients $g_2$ and $g_3$ are defined up to the substitution
$g_2 \mapsto u^4g_2$, $g_3 \mapsto u^6g_3$ with $u \in K^\times$. The modular, or absolute, invariant $j$ of
our elliptic curve is defined to be

$$j = 2^6 3^3 \frac{g_2^3}{g_2^3 - 27g_3^2} = 1728\,\frac{g_2^3}{\Delta}. \tag{5.3.9}$$

Two elliptic curves have the same absolute invariant iff they become
isomorphic over an algebraic closure of the ground field $K$. The classical Weierstrass
form (5.3.7) emerged in the theory of complex parametrizations of complex
elliptic curves.
The Riemann surface $E(\mathbb{C})$ of an elliptic curve $E$ defined over $\mathbb{C}$ is a
complex torus, that is, a quotient $\mathbb{C}/\Lambda$ where $\Lambda$ is a lattice

$$\Lambda = \{z = n_1 + n_2\tau \mid n_1, n_2 \in \mathbb{Z}\}, \quad \operatorname{Im}(\tau) > 0. \tag{5.3.10}$$

The connection between this analytic description of $E$ and an algebraic one is
based on the identification of rational functions on $E$ with $\Lambda$-periodic
meromorphic functions on $\mathbb{C}$, i.e. elliptic functions.

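Weierstrass considered the following basic functions:

$$\wp(z) = \wp(z, \Lambda) = \frac{1}{z^2} + {\sum_{\omega \in \Lambda}}' \left(\frac{1}{(z-\omega)^2} - \frac{1}{\omega^2}\right) \tag{5.3.11}$$

(prime denoting $\omega \neq 0$);

$$\wp'(z) = \wp'(z, \Lambda) = -2\sum_{\omega \in \Lambda} \frac{1}{(z-\omega)^3}. \tag{5.3.12}$$

These series converge absolutely outside $\Lambda$ and define elliptic functions. The
set of all elliptic functions with periods $\Lambda$ forms a field which is generated over
$\mathbb{C}$ by $\wp(z)$ and $\wp'(z)$. These two functions are related by the equation

$$\wp'(z)^2 = 4\wp(z)^3 - g_2\wp(z) - g_3, \tag{5.3.13}$$

where

$$g_2 = 60\,{\sum_{\omega \in \Lambda}}' \frac{1}{\omega^4}, \quad g_3 = 140\,{\sum_{\omega \in \Lambda}}' \frac{1}{\omega^6}. \tag{5.3.14}$$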


Now if an elliptic curve $E_\tau \subset \mathbb{P}^2$ is defined by the equation (5.3.7) with $g_2$
and $g_3$ from (5.3.14), we can define a map

$$\mathbb{C}/\Lambda \to E_\tau(\mathbb{C}) \tag{5.3.15}$$

for which $z \mapsto (\wp(z) : \wp'(z) : 1)$ when $z$ is not in $\Lambda$. The point 0 is mapped to
the infinite point $(0:1:0)$.

The map (5.3.15) is a complex analytic isomorphism. In order to define its
inverse, consider the differential of the first kind

$$dx/y = dx/\sqrt{4x^3 - g_2x - g_3} \tag{5.3.16}$$


on the Riemann surface E,(C). We integrate this form over a path joining a 
fixed initial point (say, o) with a variable point. 

The integral depends on the choice of path, but its image in C/A is deter- 
mined only by the endpoints. 

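According to a classical theorem due to Jacobi, the discriminant $\Delta = \Delta(\tau)$
of $E_\tau$ can be expressed via $\Lambda = \Lambda_\tau$ as

$$\Delta = (2\pi)^{12} q \prod_{m=1}^{\infty} (1 - q^m)^{24} = (2\pi)^{12} \sum_{n=1}^{\infty} \tau(n)q^n \tag{5.3.17}$$

for all $\tau \in \mathbb{C}$ with $\operatorname{Im}(\tau) > 0$, $q = \exp(2\pi i\tau)$. The function $\tau(n)$ is called
Ramanujan’s function. Its first few values are

$$\tau(1) = 1, \quad \tau(2) = -24, \quad \tau(3) = 252, \quad \tau(4) = -1472.$$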
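
The values of Ramanujan’s function quoted above can be recovered directly from the product in (5.3.17). The following sketch expands $q\prod_{m \ge 1}(1-q^m)^{24}$ as a truncated power series with integer coefficients; the function name `ramanujan_tau` is ours.

```python
def ramanujan_tau(n_max):
    """Return {n: tau(n)} for 1 <= n <= n_max from q * prod (1 - q^m)^24."""
    # coeffs[k] = coefficient of q^k in prod_{m>=1} (1 - q^m)^24,
    # kept only for 0 <= k <= n_max - 1.
    coeffs = [0] * n_max
    coeffs[0] = 1
    for m in range(1, n_max):
        for _ in range(24):
            # Multiply in place by (1 - q^m); iterating downwards keeps
            # coeffs[k - m] at its old value when it is read.
            for k in range(n_max - 1, m - 1, -1):
                coeffs[k] -= coeffs[k - m]
    # The extra factor q shifts every exponent by one.
    return {n: coeffs[n - 1] for n in range(1, n_max + 1)}


if __name__ == "__main__":
    print(ramanujan_tau(6))
```

Only factors with $m < n_{\max}$ can influence the coefficients that are kept, which is why the outer loop stops there.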


The absolute invariant of $E_\tau$ is by definition

$$j(\tau) = 1728\,\frac{g_2(\tau)^3}{\Delta(\tau)} = q^{-1} + 744 + \sum_{n=1}^{\infty} c(n)q^n, \tag{5.3.18}$$

where $c(1) = 196884$, $c(2) = 21493760, \dots$. One can prove that $j(\tau)$ takes all
complex values, which shows that every elliptic curve over $\mathbb{C}$ is isomorphic to
$E_\tau$ for an appropriate $\tau$.
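
The first coefficients of the $q$-expansion (5.3.18) can be reproduced with integer power series. The sketch below relies on the equivalent form $j = E_4^3 / \bigl(q\prod_{m \ge 1}(1-q^m)^{24}\bigr)$ with $E_4 = 1 + 240\sum_{n \ge 1}\sigma_3(n)q^n$; this follows from (5.3.17) together with the normalization $g_2 = (2\pi)^4E_4/12$, which is assumed here. All function names are ours.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, size):
    """Product of two power series (coefficient lists), truncated to `size` terms."""
    product = [0] * size
    for i, a_i in enumerate(a[:size]):
        for k, b_k in enumerate(b[:size - i]):
            product[i + k] += a_i * b_k
    return product


def j_expansion(size):
    """Return [c(-1), c(0), c(1), ...] of length `size`, where j = sum c(n) q^n."""
    e4 = [1] + [240 * sigma(3, n) for n in range(1, size)]
    e4_cubed = multiply(multiply(e4, e4, size), e4, size)
    # eta24[k] = coefficient of q^k in prod_{m>=1} (1 - q^m)^24.
    eta24 = [1] + [0] * (size - 1)
    for m in range(1, size):
        for _ in range(24):
            for k in range(size - 1, m - 1, -1):
                eta24[k] -= eta24[k - m]
    # Invert eta24 (constant term 1) by the usual recursion.
    inverse = [1] + [0] * (size - 1)
    for k in range(1, size):
        inverse[k] = -sum(eta24[i] * inverse[k - i] for i in range(1, k + 1))
    # j = E4^3 / (q * eta24): index 0 of the result is the coefficient of q^(-1).
    return multiply(e4_cubed, inverse, size)


if __name__ == "__main__":
    print(j_expansion(4))
```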


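Two curves $E_\tau$, $E_{\tau'}$ are isomorphic iff $\tau' = \dfrac{a\tau + b}{c\tau + d}$ for some matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$. In fact, a complex analytic isomorphism $\mathbb{C}/\Lambda_{\tau'} \to \mathbb{C}/\Lambda_\tau$ is necessarily
induced by multiplication by some $u \in \mathbb{C}^\times$. Therefore, $\Lambda_\tau = u\Lambda_{\tau'}$, so that
$(u, u\tau')$ is a basis of $\Lambda_\tau$, and $u = c\tau + d$, $u\tau' = a\tau + b$. The linear transformation
is unimodular because $(1, \tau')$, $(u, u\tau')$ and $(1, \tau)$ all define the same orientation
of $\mathbb{C}$. We therefore have

$$g_2(\tau') = u^4g_2(\tau), \quad g_3(\tau') = u^6g_3(\tau). \tag{5.3.19}$$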


To sum up, isomorphism classes of elliptic curves over $\mathbb{C}$ correspond bijectively
to points of the quotient space $H/\Gamma$, where $H$ is the upper half plane

$$H = \{\tau \in \mathbb{C} \mid \operatorname{Im}(\tau) > 0\}, \tag{5.3.20}$$

and the modular group

$$\Gamma = \mathrm{SL}(2, \mathbb{Z}) \tag{5.3.21}$$

acts on $H$ by fractional linear transformations. The isomorphism $\mathbb{C}/\Lambda_\tau \to E_\tau(\mathbb{C})$
is also compatible with the natural group structures. In terms of elliptic
functions, this is reflected in the addition theorem for elliptic functions:

$$\wp(z_1 + z_2) = -\wp(z_1) - \wp(z_2) + \frac{1}{4}\left(\frac{\wp'(z_1) - \wp'(z_2)}{\wp(z_1) - \wp(z_2)}\right)^2. \tag{5.3.22}$$

Fig. 5.3.




In terms of the coordinates $(x, y)$ satisfying (5.3.7), we have

$$x_3 = -x_1 - x_2 + \frac{1}{4}\left(\frac{y_1 - y_2}{x_1 - x_2}\right)^2,$$

where

$$P_1 = (x_1, y_1), \quad P_2 = (x_2, y_2), \quad P_3 = P_1 * P_2 = (x_3, y_3).$$
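
The coordinate form of the addition law is easy to try out over $\mathbb{Q}$. The following sketch adds two points with distinct $x$-coordinates on a curve $y^2 = 4x^3 - g_2x - g_3$, using the chord construction: the line through $P_1$ and $P_2$ meets the curve in a third point, whose reflection in the $x$-axis is $P_1 * P_2$. The example curve with $g_2 = 4$, $g_3 = -4$ and the function names are our own illustrative choices.

```python
from fractions import Fraction


def add_points(p1, p2):
    """Add two points with distinct x-coordinates on y^2 = 4x^3 - g2*x - g3.

    The coefficients g2, g3 are not needed when x1 != x2, because the
    chord through the two points already determines the third intersection.
    """
    x1, y1, x2, y2 = map(Fraction, (*p1, *p2))
    if x1 == x2:
        raise ValueError("this sketch only handles the case x1 != x2")
    slope = (y1 - y2) / (x1 - x2)
    x3 = -x1 - x2 + slope**2 / 4
    # The chord meets the curve again at (x3, y1 + slope * (x3 - x1));
    # the sum is the reflection of that point in the x-axis.
    y3 = -(y1 + slope * (x3 - x1))
    return x3, y3


def on_curve(point, g2, g3):
    """Check whether `point` satisfies y^2 = 4x^3 - g2*x - g3."""
    x, y = map(Fraction, point)
    return y**2 == 4 * x**3 - g2 * x - g3


if __name__ == "__main__":
    g2, g3 = 4, -4                      # y^2 = 4x^3 - 4x + 4
    total = add_points((0, 2), (1, 2))
    print(total, on_curve(total, g2, g3))
```

The factor $\tfrac14$ in the formula for $x_3$ comes from the leading coefficient 4 of the cubic: substituting the chord into (5.3.7) gives a cubic in $x$ whose three roots sum to one quarter of the squared slope.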


Topologically, $\mathbb{C}/\Lambda$ is a surface of genus one. It can be obtained from the
parallelogram $\{u_1 + u_2\tau \mid 0 \le u_1, u_2 \le 1\}$ by identifying the opposite sides
(cf. Fig. 5.3).

Points of finite order. Let $E$ be an elliptic curve defined over a field $K$.
For an integer $N$ denote by $E_N$ the kernel of the map which multiplies each
point by $N$:

$$N_E : E(\bar{K}) \to E(\bar{K}), \quad N_E(t) = Nt. \tag{5.3.23}$$

If $E$ is defined over $\mathbb{C}$ then the isomorphism $\mathbb{C}/\Lambda \cong E(\mathbb{C})$ shows that

$$E_N \cong \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}.$$

In fact $E_N$ corresponds to the subgroup $\frac{1}{N}\Lambda/\Lambda \subset \mathbb{C}/\Lambda$. For example, the 2-torsion
points are represented by $0$, $1/2$, $\tau/2$, $(1+\tau)/2$. It follows that (5.3.15) maps
$1/2$, $\tau/2$, $(1+\tau)/2$ onto $(x_i, 0)$ for $i = 1, 2, 3$, where $x_i$ are the roots of the
polynomial $4x^3 - g_2x - g_3$. In other words,

$$\wp'(1/2) = \wp'(\tau/2) = \wp'((1+\tau)/2) = 0,$$

and

$$4x^3 - g_2x - g_3 = 4(x - e_1)(x - e_2)(x - e_3),$$

where

$$e_1 = \wp(1/2), \quad e_2 = \wp(\tau/2), \quad e_3 = \wp((1+\tau)/2).$$

The 3-torsion points have a nice geometric interpretation: they are the
points of inflection of the projective Weierstrass model.


For any ground field $K$, the morphism $N_E$ has degree $N^2$, and if $(\operatorname{char} K, N) = 1$,
we still have

$$E(\bar{K})_N \cong \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}. \tag{5.3.24}$$

However for $\operatorname{char}(K) = p$ and $N = p^m$ we have

$$E(\bar{K})_N \cong (\mathbb{Z}/p^m\mathbb{Z})^{\gamma_E}, \tag{5.3.25}$$

where $\gamma_E = 0$ or $1$ (cf. [La73/87], [La88]).

Assume that $(\operatorname{char}(K), N) = 1$. The field $K(E_N)$, generated by the
coordinates of all points of $E_N$, is a Galois extension of $K$ and the action of
$\mathrm{Gal}(\bar{K}/K)$ on $E(\bar{K})_N \cong \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}$ determines a representation

$$\rho_N : \mathrm{Gal}(\bar{K}/K) \to \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z}) \tag{5.3.26}$$


whose image is isomorphic to $\mathrm{Gal}(K(E_N)/K)$. The field $K(E_N)$ can be
regarded as an analog of the cyclotomic field $K(\zeta_N)$. However it is not in general
Abelian; only for the so-called complex multiplication curves is the analogy
really far-reaching. This is a basic example of the kind of construction made
in Abelian and non-Abelian class field theory.

It is known that the representation $\det(\rho_N)$ is the cyclotomic character of
$G_K = \mathrm{Gal}(\bar{K}/K)$; that is, it corresponds to the action of $G_K$ on the group
$\mu_N$ of $N$-th roots of unity. Actually these roots are contained in $K(E_N)$, and
$E_N$ is endowed with a canonical non-degenerate alternating Weil pairing

$$e_N : E(\bar{K})_N \times E(\bar{K})_N \to \mu_N \tag{5.3.27}$$

compatible with the action of $G_K$. This is defined purely algebraically, with the
help of the functions $f_P$, $P \in E(\bar{K})_N$, such that $\operatorname{div}(f_P) = NP - No$. Calculating
the pairing for an elliptic curve $E$ over $\mathbb{C}$, given by a period lattice $\Lambda$, we obtain

$$e_N((a + b\tau)/N, (c + d\tau)/N) = \exp(2\pi i(ad - bc)/N). \tag{5.3.28}$$


5.3.3 Tate Curve and Its Points of Finite Order 

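(see [Ta74], [He97], p.343). Let us write again the Weierstrass equation for

$$\mathbb{C}/(2\pi i)\Lambda \xrightarrow{\ \sim\ } \mathbb{C}^\times/\langle q^{\mathbb{Z}} \rangle \quad (u \mapsto \exp(u))$$

in the following form

$$Y^2 = 4X^3 - \frac{E_4}{12}X + \frac{E_6}{216} \quad (X = \wp(2\pi iu, (2\pi i)\Lambda),\ Y = \wp'(2\pi iu, (2\pi i)\Lambda)), \quad \omega = 2\pi i\,du,$$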


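using the Eisenstein series (see also §6.3.2, (6.3.4)):

$$G_k(\tau) = {\sum_{m_1, m_2 \in \mathbb{Z}}}' (m_1 + m_2\tau)^{-k} \tag{5.3.29}$$

$$= \frac{2(2\pi i)^k}{(k-1)!}\left[-\frac{B_k}{2k} + \sum_{n=1}^{\infty} \sigma_{k-1}(n)\exp(2\pi in\tau)\right]$$

$$= -\frac{2(2\pi i)^k}{(k-1)!}\cdot\frac{B_k}{2k}\left[1 - \frac{2k}{B_k}\sum_{n=1}^{\infty} \sigma_{k-1}(n)\exp(2\pi in\tau)\right] = -\frac{(2\pi i)^kB_k}{k!}\,E_k(\tau),$$

where the prime denotes $(m_1, m_2) \neq (0, 0)$,

$$E_k(\tau) = 1 - \frac{2k}{B_k}\sum_{n=1}^{\infty} \sigma_{k-1}(n)\exp(2\pi in\tau),$$

$\sigma_{k-1}(n) = \sum_{d \mid n} d^{k-1}$ and $B_k$ is the $k$-th Bernoulli number. In particular we
have that

$$12(2\pi i)^{-4}g_2(\tau) = E_4 = 1 + 240\sum_{n=1}^{\infty} \sigma_3(n)q^n \quad (q = \exp(2\pi i\tau)),$$

$$-216(2\pi i)^{-6}g_3(\tau) = E_6 = 1 - 504\sum_{n=1}^{\infty} \sigma_5(n)q^n.$$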


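Let us pass to the variables $x$, $y$ via the substitution

$$X = x + \frac{1}{12}, \quad Y = x + 2y,$$

in order to obtain a new equation of this curve (with coefficients in $\mathbb{Z}[[q]]$):

$$\mathrm{Tate}(q) : y^2 + xy = x^3 + B(q)x + C(q),$$

$$B(q) = -5\left(\frac{E_4 - 1}{240}\right) = -5\sum_{n=1}^{\infty} \sigma_3(n)q^n = -5\sum_{n=1}^{\infty} \frac{n^3q^n}{1 - q^n}, \tag{5.3.30}$$

$$C(q) = \frac{-5\left(\frac{E_4 - 1}{240}\right) - 7\left(\frac{E_6 - 1}{-504}\right)}{12} = -\frac{1}{12}\sum_{n=1}^{\infty} \frac{(7n^5 + 5n^3)q^n}{1 - q^n}.$$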

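This equation defines an elliptic curve over the ring $\mathbb{Z}((q))$ with the canonical
differential $\omega_{\mathrm{can}}$ given by

$$\omega_{\mathrm{can}} = \frac{dx}{2y + x} = \frac{dX}{Y},$$

$$g_2 = 60G_4 = (2\pi i)^4\,\frac{E_4}{12}, \quad g_3 = 140G_6 = -(2\pi i)^6\,\frac{E_6}{216}.$$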

Let $N > 1$ be a natural number. Let us define

$$\mathrm{Tate}(q^N) : y^2 + xy = x^3 + B(q^N)x + C(q^N).$$

Next we put $t = \exp(2\pi iu)$; then the points of order $N$ on $\mathrm{Tate}(q^N)$
correspond to $t = \zeta_N^i q^j$ $(0 \le i, j \le N - 1)$, $\zeta_N = \exp(2\pi i/N)$, and their coordinates
are given by
It is important for arithmetical applications of the Tate curve that these
coordinates belong to the ring $\mathbb{Z}[\zeta_N, N^{-1}][[q]]$.


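The proof uses the identity

$$\sum_{n \in \mathbb{Z}} (u + n)^{-k} = \frac{(-2\pi i)^k}{(k-1)!}\sum_{n=1}^{\infty} n^{k-1}e^{2\pi inu} \quad (k \ge 2,\ u \in H).$$

We have for the lattice $\Lambda = 2\pi i(\mathbb{Z} + \tau\mathbb{Z})$ the following equalities

$$X = \wp(2\pi iu) = (2\pi i)^{-2}\left(u^{-2} + {\sum_{m, n \in \mathbb{Z}}}' \left((u + m\tau + n)^{-2} - (m\tau + n)^{-2}\right)\right) \tag{5.3.31}$$

$$= (2\pi i)^{-2}\left(\sum_{m \in \mathbb{Z}}\sum_{n \in \mathbb{Z}} (u + m\tau + n)^{-2} - 2\sum_{m=1}^{\infty}\sum_{n \in \mathbb{Z}} (m\tau + n)^{-2} - 2\zeta(2)\right)$$

$$= \sum_{m \in \mathbb{Z}}\sum_{n=1}^{\infty} ne^{2\pi i(u + m\tau)n} - 2\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} ne^{2\pi imn\tau} + \frac{1}{12},$$

implying the above identities.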


5.3.4 The Mordell — Weil Theorem and Galois Cohomology 


The fundamental arithmetical property of elliptic curves defined over an al- 
gebraic number field K is the following result. 


Theorem 5.9 (The Mordell-Weil Theorem). The Abelian group $E(K)$
is finitely generated, that is

$$E(K) \cong E(K)_{\mathrm{tors}} \oplus \mathbb{Z}^{r_E}, \tag{5.3.32}$$

where $E(K)_{\mathrm{tors}}$ is a finite (torsion) group, and $r_E$ is an integer $\ge 0$, called
the rank of $E$ over $K$.

(cf. [Wei79], [La83], [Se97] and the Appendix by Yu.Manin to [Mum74]).

This theorem is proved in two steps. One first shows that $E(K)/nE(K)$
is finite (for some, or every $n \ge 2$). Then one uses a descent argument based
on the following property of logarithmic heights $h(P)$:

$$h(P) \le \mathrm{const} + n^{-2}h(nP + P_0)$$

for a fixed $P_0$, variable $P$ and a constant independent of $P$.
The weak finiteness theorem for $E(K)/nE(K)$ can be established by a
kind of Kummer theory for $K(E_n)$.
Consider the extension $K(\frac{1}{n}E(K))$ of $K(E_n)$. One proves that this is a
finite Abelian extension whose exponent divides $n$. This can be deduced from
Hermite’s theorem on the finiteness of the number of extensions of a fixed
number field having prescribed degree and ramification points. In order to
apply Hermite’s result one must check that every ramified prime either divides
$n$ or is a point of bad reduction of $E$.

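Now consider the exact sequence

$$0 \to E_n \to E(\bar{K}) \xrightarrow{\ n\ } E(\bar{K}) \to 0. \tag{5.3.33}$$

This gives rise to an exact sequence of Galois cohomology groups

$$E(K) \xrightarrow{\ n\ } E(K) \to H^1(G_K, E_n) \to H^1(G_K, E(\bar{K})) \xrightarrow{\ n\ } H^1(G_K, E(\bar{K}))$$

which can be rewritten as

$$0 \to E(K)/nE(K) \to H^1(G_K, E_n) \to H^1(G_K, E(\bar{K}))_n \to 0. \tag{5.3.34}$$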


Although the group $H^1(G_K, E_n)$ is infinite, the image of $E(K)/nE(K)$ is
contained in a finite subgroup, which we shall describe in geometric terms.
An element $\alpha \in H^1(G_K, E_n)$ corresponds to an $n$-fold covering of $E$ over
$K$, that is to a map $a : C \to E$ of algebraic curves, which becomes isomorphic
to $n : E \otimes \bar{K} \to E \otimes \bar{K}$ when the ground field $K$ is extended to $\bar{K}$. Given
such a covering, one constructs a 1-cocycle by choosing a point $P \in E(\bar{K})$,
an inverse image $Q = a^{-1}P$, and a point $Q_1 \in E(\bar{K})$, which corresponds to
$Q$ under a structure isomorphism $C(\bar{K}) \cong E(\bar{K})$. Then one defines $\alpha$ as the
class of the cocycle:

$$\sigma \mapsto \alpha_\sigma = Q_1^\sigma - Q_1 \in E_n \quad (\sigma \in G_K) \tag{5.3.35}$$

(subtraction refers to the group law on $E$; we shall later on denote it by $+$
instead of $*$). Elements $\beta \in H^1(G_K, E(\bar{K}))$ are interpreted as isomorphism
classes of the principal homogeneous spaces $X$ of $E$ over $K$, that is, curves $X$
given together with group actions $E \times X \to X$ which become isomorphic to
the addition morphism of $E$ when the ground field is extended to $\bar{K}$. Given
such an $X$, choose a point $P \in E(\bar{K})$, a point $P_1 \in X(\bar{K})$ corresponding to
$P$ under a structure isomorphism, and define a cocycle

$$\sigma \mapsto \beta_\sigma = P_1^\sigma - P_1 \in E(\bar{K}) \quad (\sigma \in G_K). \tag{5.3.36}$$

A different choice of $P$ leads to a cohomologous cocycle. The cohomology class
is trivial iff $X$ has a $K$-rational point. This establishes a direct connection
between Galois cohomology and Diophantine geometry.

The exact sequence (5.3.34) can be conveniently described in this setting.
A point $P \in E(K)$ determines an $n$-covering

$$t_P n_E : E \to E, \tag{5.3.37}$$

where $t_P$ is the translation by $P$. Now choose a point $Q \in E(\bar{K})$ such that
$nQ = P$. Then $t_P n = n t_Q$, so that the translation $t_Q : E \otimes \bar{K} \to E \otimes \bar{K}$ is
a $\bar{K}$-isomorphism of algebraic curves, turning (5.3.37) into multiplication by
$n$. Therefore, our $n$-covering becomes trivial over $K' = K(\frac{1}{n}E(K))$. Hence its
class belongs to the finite subgroup

$$M_n = \mathrm{Infl}(H^1(G(K'/K), E_n)) \subset H^1(G_K, E_n),$$

whose order can be bounded in terms of the degree and ramification of $K'$.
This finishes the proof of the weak Mordell-Weil theorem.
The descent argument proceeds as follows: choose a finite number of
representatives

$$P_1, \dots, P_s$$

of $E(K)/nE(K)$. There is a constant $C$ such that if $h(P) > C$, then

$$h\left(\tfrac{1}{n}(P - P_i)\right) < h(P),$$

where $P_i$ is congruent to $P$ modulo $n$. Hence $P$ can be represented as a linear
combination of

$$P_1, \dots, P_s$$

and points of height $\le C$, whose number is finite.

The exact sequence (5.3.34) can be used to obtain upper bounds for the
rank $r_E$. In fact, if $n = p$ is a prime and $M$ is a finite subgroup of $H^1(G_K, E_p)$
containing the image of $E(K)/pE(K)$ then (5.3.34) shows that

$$r_E \le \mathrm{rk}_{\mathbb{Z}/p\mathbb{Z}}(M) - \mathrm{rk}_{\mathbb{Z}/p\mathbb{Z}}(E(K)_p). \tag{5.3.38}$$

Any improvement on this bound would require an understanding of the
cokernel of the map

$$E(K)/pE(K) \to M.$$

In order to choose a small, well-defined $M$, it is convenient to apply the
usual local-to-global constructions. For each place $v$ of $K$, choose an
extension $w$ of $v$ to $\bar{K}$ and denote by $G_v \subset G_K$ the corresponding decomposition
subgroup $G_v \cong G(\bar{K}_w/K_v)$. Then for an arbitrary $G_K$-module $A$ we have
restriction homomorphisms $H^1(G_K, A) \to H^1(G_v, A)$. In our setting, these fit
into the commutative diagram

$$\begin{array}{ccccccccc}
0 & \to & E(K)/nE(K) & \to & H^1(G_K, E_n) & \xrightarrow{\ \alpha\ } & H^1(G_K, E(\bar{K}))_n & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow {\scriptstyle \beta = \prod_v \beta_v} & & \\
0 & \to & \prod_v E(K_v)/nE(K_v) & \to & \prod_v H^1(G_v, E_n) & \to & \prod_v H^1(G_v, E(\bar{K}_w))_n & \to & 0
\end{array}$$

in which $\beta_v$ denotes the composition of the restriction morphism and the
morphism induced by the inclusion $E(\bar{K}) \to E(\bar{K}_w)$.
Let us consider the group

$$\text{Ш}(E, K) = \bigcup_{n \in \mathbb{N}} \text{Ш}(E, K)_n, \quad \text{Ш}(E, K)_n = \mathrm{Ker}(\beta). \tag{5.3.39}$$

This group is called the Shafarevich-Tate group of $E$ over $K$; its
interpretation in terms of the Brauer group and connection with the Brauer-Manin
obstruction is explained in [Man70b]. In our setting, an element of $\text{Ш}(E, K)$
corresponds to a principal homogeneous space of $E$ over $K$ (up to
isomorphism) which has a $K_v$-point in every completion $K_v$ of $K$.

The group

$$S(E, K)_n = \alpha^{-1}(\text{Ш}(E, K)_n) \tag{5.3.40}$$

(and the inductive limit of these groups over all $n$) is called the Selmer group of
$E$. An element of $S(E, K)_n$ can be interpreted as (the class of) an $n$-covering
$C \to E$ such that $C$ has a $K_v$-point in each completion $K_v$ of $K$. By definition
we have an exact sequence

$$0 \to E(K)/nE(K) \to S(E, K)_n \to \text{Ш}(E, K)_n \to 0. \tag{5.3.41}$$

One can say that $\text{Ш}(E, K)$ is a cohomological obstruction to a calculation
of $E(K)$. There is a conjecture that $\text{Ш}(E, K)$ is finite. This was proved by
K.Rubin in [Rub77] for certain curves with complex multiplication, and by
V.Kolyvagin (cf. [Koly88]) for a class of curves uniformized by modular curves.
More recently, these results were extended to a class of curves without
complex multiplication by K.Kato, cf. [Scho98].

We shall return to this question in Chapter 6 in connection with zeta- 
functions and modular functions. 

We now consider in more detail the properties of the height function
$h_D : E(K) \to \mathbb{R}$ corresponding to a divisor $D$, or, equivalently, to the invertible
sheaf $\mathcal{O}(D)$ of degree $d$ on an elliptic curve $E$. Since the degree of the map
$n_E$ is $n^2$, one can check that

$$h_D \circ n_E \sim n^2h_D. \tag{5.3.42}$$

More precisely the following limit exists:

$$\hat{h}_D(x) = \lim_{N \to \infty} h_D(2^Nx)/2^{2N}. \tag{5.3.43}$$

This limit $\hat{h}_D$ is called the Néron-Tate height.
If the divisor $D$ is ample (see 5.1.13) then $\hat{h}_D$ is a quadratic form on $E(K)$,
which is positive definite modulo torsion. Moreover its natural extension

$$\hat{h}_D : E(K) \otimes_{\mathbb{Z}} \mathbb{R} \to \mathbb{R}$$

is of the form $d\,b_0$, where $d = \deg(D)$ and $b_0$ is a positive definite quadratic
form independent of $D$. The kernel of the natural map $E(K) \to E(K) \otimes_{\mathbb{Z}} \mathbb{R}$
is the finite torsion subgroup $E(K)_{\mathrm{tors}}$; its image is a lattice in the
$r_E$-dimensional Euclidean space with the scalar product

$$(P, Q) = \frac{1}{2}\left[b_0(P + Q) - b_0(P) - b_0(Q)\right].$$

Therefore, the region $\hat{h}_D \le \log(B)$ in this space is a ball of radius

$$(d^{-1}\log(B))^{1/2}.$$

The number of points in this ball is asymptotically proportional to its volume,
that is $\mathrm{const} \cdot (\log(B))^{r_E/2}$. The constant in this expression depends on the
volume of a fundamental domain for the lattice $E(K)$ mod torsion, that is, on
the regulator of $E$ over $K$:

$$H = H(E, K) = \det((P_i, P_j))^{1/2}. \tag{5.3.44}$$


B.Mazur has proved that $\mathrm{Card}(E(\mathbb{Q})_{\mathrm{tors}})$ is universally bounded (cf. [Maz77]).
This result was extended to all number fields by L. Merel, cf. [Mer96]. Actually
Mazur showed that $E(\mathbb{Q})_{\mathrm{tors}}$ is always isomorphic to one of the following fifteen
groups:

$$\mathbb{Z}/m\mathbb{Z} \ (m \le 10,\ m = 12), \quad \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\nu\mathbb{Z} \ (\nu \le 4).$$

All these groups arise in this way.
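
The count of fifteen groups in Mazur’s list can be checked by simple enumeration. In the sketch below a group is encoded as a tuple of cyclic orders, so that `(2, 8)` stands for $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/8\mathbb{Z}$; this encoding and the function names are ours.

```python
def mazur_torsion_groups():
    """List the possible torsion subgroups of E(Q) as tuples of cyclic orders."""
    cyclic = [(m,) for m in range(1, 13) if m <= 10 or m == 12]
    non_cyclic = [(2, 2 * v) for v in range(1, 5)]
    return cyclic + non_cyclic


def group_order(group):
    """Order of a product of cyclic groups given by its tuple of orders."""
    order = 1
    for factor in group:
        order *= factor
    return order


if __name__ == "__main__":
    groups = mazur_torsion_groups()
    print(len(groups), max(group_order(g) for g in groups))
```

In particular the list shows that $\mathrm{Card}(E(\mathbb{Q})_{\mathrm{tors}}) \le 16$.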

It is conjectured that there are elliptic curves of arbitrarily large rank over
$\mathbb{Q}$. J.-L. Mestre constructed curves of rank $r_E \ge 14$ ([Me82]), by choosing
equations in such a way that their reductions modulo many primes $p$ have as
many points modulo $p$ as possible. A concrete example of a calculation of the
group $E(\mathbb{Q})$ is given in [Maz86]. Consider the curve

$$E : -206y^2 = x^3 - x^2 + 1/4$$

and three points on it

Point    x         y          Néron-Tate height
P_1      -15/8     7/32       1.52009244
P_2      -55/8     43/32      2.05430703
P_3      -55/98    47/1372    2.42706090

A descent argument shows that $r_E \le 3$, and a height computation allows
one to conclude that $P_1, P_2, P_3$ are linearly independent generators of $E(\mathbb{Q}) \cong \mathbb{Z}^3$.
The absence of torsion can be checked by $p$-adic calculations.
For a given elliptic curve, the numbers $|E(\mathbb{Q})_{\mathrm{tors}}|$, $r_E$, $H(E, K)$, $|\text{Ш}(E, K)|$
(conjecturally finite), and the conductor (a product of primes of bad reduction)
are the most important arithmetical invariants of $E$. Later we shall see that
all these invariants are combined (partly conjecturally) in the properties of its
zeta-function (see §6.4.4).
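
Returning to the curve $-206y^2 = x^3 - x^2 + 1/4$ of the example above, one can at least confirm with exact rational arithmetic that the three listed points lie on it. The sketch below does only this; it does not compute heights or ranks, and the names `POINTS` and `on_mazur_curve` are ours.

```python
from fractions import Fraction as F

# Coordinates (x, y) of the three points quoted from [Maz86].
POINTS = {
    "P1": (F(-15, 8), F(7, 32)),
    "P2": (F(-55, 8), F(43, 32)),
    "P3": (F(-55, 98), F(47, 1372)),
}


def on_mazur_curve(x, y):
    """Check the equation -206*y^2 = x^3 - x^2 + 1/4 exactly."""
    return -206 * y**2 == x**3 - x**2 + F(1, 4)


if __name__ == "__main__":
    for name, (x, y) in POINTS.items():
        print(name, on_mazur_curve(x, y))
```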


5.3.5 Abelian Varieties and Jacobians 


(cf. [Mum74], [La58], [Wei48]). Abelian varieties are multi-dimensional
generalizations of elliptic curves. By definition, an Abelian variety $A$ over a field $K$
is a non-singular projective variety, together with a group structure given by
morphisms over $K$:

$$A \times A \to A \ ((x, y) \mapsto x + y), \quad A \to A \ (x \mapsto -x).$$

One can prove that any such structure is commutative, which justifies the
additive notation.

A homomorphism of Abelian varieties is a morphism $\lambda : A \to B$ of
algebraic varieties which is a group homomorphism. If $\dim(A) = \dim(B)$, the
surjectivity of $\lambda$ is equivalent to the condition that the kernel of $\lambda$ is finite.
If these conditions are satisfied then $\lambda$ is called an isogeny, and $A$ and $B$ are
said to be isogenous.

In particular, multiplication by an integer $m_A : A \to A$, $m_A(x) = mx$, is
an isogeny of degree $m^{2g}$, $g = \dim(A)$. If the characteristic of the ground field
does not divide $m$, then

$$A_m = A(\bar{K})_m = \mathrm{Ker}(m_A) \cong (\mathbb{Z}/m\mathbb{Z})^{2g}.$$

In particular the action of the Galois group on $A_m$ defines a Galois
representation

$$\rho_m : G_K \to \mathrm{Aut}(A_m) \cong \mathrm{GL}_{2g}(\mathbb{Z}/m\mathbb{Z}). \tag{5.3.45}$$

These representations are the best studied examples of the general Galois
actions on Grothendieck’s étale cohomology groups. As in the case of elliptic
curves, there is a non-degenerate alternating Weil pairing

$$e_m : A(\bar{K})_m \times A(\bar{K})_m \to \mu_m. \tag{5.3.46}$$

This is compatible with the action of the Galois group, so that

$$\mathrm{Im}(\rho_m) \subset \mathrm{GSp}_g(\mathbb{Z}/m\mathbb{Z}) \subset \mathrm{GL}_{2g}(\mathbb{Z}/m\mathbb{Z}),$$

where $\mathrm{GSp}_g$ is the group of symplectic matrices: for an arbitrary ring $R$,

$$\mathrm{GSp}_g(R) = \{M \in \mathrm{GL}_{2g}(R) \mid M^tJ_gM = \mu(M)J_g,\ \mu(M) \in R^\times\}, \tag{5.3.47}$$

where $J_g$ is a standard symplectic matrix. Actually, the construction of $e_m$
depends on the choice of a polarization on $A$ (cf. below).

If $A$ is an Abelian variety defined over $\mathbb{C}$, the complex variety $A(\mathbb{C})$ is
isomorphic to a complex torus $\mathbb{C}^g/\Lambda$, where $\Lambda$ is a lattice in $\mathbb{C}^g$. Not every
complex torus, however, can be obtained in this way. A necessary and sufficient
condition for this is the existence of an $\mathbb{R}$-valued, $\mathbb{R}$-bilinear form $E(z, w)$ with
the following properties:

$$E(z, w) = -E(w, z). \tag{5.3.48}$$
$$E(z, w) \in \mathbb{Z} \text{ for all } z, w \in \Lambda. \tag{5.3.49}$$
$$E(z, iw) \text{ is an } \mathbb{R}\text{-bilinear, symmetric, positive definite form.} \tag{5.3.50}$$

Such a form $E$ is called a Riemannian form on the complex torus $\mathbb{C}^g/\Lambda$.
It also defines a Hermitean Riemannian form on $\mathbb{C}^g$:

$$H(z, w) = E(iz, w) + iE(z, w). \tag{5.3.51}$$

If such a form $E$ exists at all, it is not unique. We shall say that a choice of
$E$ defines a polarization of $A$.

An Abelian variety together with a polarization is called a polarized 
Abelian variety. 

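We recall the following classification theorem for non-degenerate
alternating integral forms on a lattice $\Lambda \cong \mathbb{Z}^{2g}$: for each form, there exists a basis
$\{\lambda_1, \dots, \lambda_{2g}\}$ of $\Lambda$ such that

$$E(\lambda_i, \lambda_j) = E(\lambda_{g+i}, \lambda_{g+j}) = 0 \quad \text{for } 1 \le i, j \le g,$$
$$E(\lambda_i, \lambda_{g+j}) = e_i\delta_{ij} \quad \text{for } 1 \le i, j \le g,$$

where $e_1, \dots, e_g$ are natural numbers,

$$e_1 \mid e_2, \ \dots, \ e_{g-1} \mid e_g.$$

Clearly,

$$\det{}_\Lambda(E) = (e_1e_2 \cdots e_g)^2.$$

A polarization with determinant 1 is called a principal polarization.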
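
The determinant formula can be confirmed on examples. The sketch below builds the Gram matrix of $E$ in a basis of the above type, that is, the block matrix with $\mathrm{diag}(e_1, \dots, e_g)$ in the upper right corner and its negative in the lower left, and computes its determinant by exact Gaussian elimination. The function names are ours.

```python
from fractions import Fraction


def riemann_form_matrix(elementary_divisors):
    """Gram matrix (E(lambda_i, lambda_j)) in the normal basis described above."""
    g = len(elementary_divisors)
    matrix = [[0] * (2 * g) for _ in range(2 * g)]
    for i, e_i in enumerate(elementary_divisors):
        matrix[i][g + i] = e_i       # E(lambda_i, lambda_{g+i}) = e_i
        matrix[g + i][i] = -e_i      # the form is alternating
    return matrix


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det               # a row swap changes the sign
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


if __name__ == "__main__":
    print(determinant(riemann_form_matrix([1, 2, 6])))
```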

There is a totally different definition of polarization, which is purely
algebraic and is valid over any ground field. Namely, consider an arbitrary
projective embedding $A \hookrightarrow \mathbb{P}^N$. Call two embeddings equivalent if one can
be obtained from the other by a projective transformation composed with a
translation by a point of $A$. An equivalence class of projective embeddings
defines a linear system of hyperplane sections $D$ of $A$. Over the complex
ground field, this gives rise to an integral 2-cohomology class of $A(\mathbb{C})$, which
in turn defines a Riemannian form $E$, in view of the known structure of the
cohomology ring of a torus. Elaborating this correspondence, one obtains the
following




Definition 5.10. An (algebraic) polarization of an Abelian variety A is a 
class of ample divisors {D} up to algebraic equivalence. 


5.3.6 The Jacobian of an Algebraic Curve 


([La58], [Wei48]). Let $X$ be a non-singular projective curve over a field $K$. One
defines in an invariant way an Abelian variety $J = J_X$, which parametrizes
the invertible sheaves (or divisor classes) of degree zero on $X$. This Abelian
variety is called the Jacobian of $X$. For $K = \mathbb{C}$, its structure is essentially
described by Abel’s theorem. Consider a divisor

$$a = \sum n_iP_i, \quad \sum n_i = 0.$$

We have $a = \partial C$ where $C$ is a 1-chain. Choose a basis of the differentials of
the first kind

$$\{\omega_1, \dots, \omega_g\}$$

on $X$, where $g$ is the genus of $X$. Consider the point

$$\left(\int_C \omega_1, \dots, \int_C \omega_g\right) \in \mathbb{C}^g.$$

Since $C$ is determined by $a$ only up to the addition of a 1-cycle, this point is only well
defined modulo the period lattice $H_1(X, \mathbb{Z})$ of our basis. Abel’s theorem asserts
that the map sending $a$ to the class of this point in the torus $\mathbb{C}^g/H_1(X, \mathbb{Z})$
identifies this torus with the group $J_X(\mathbb{C})$ of all classes of divisors of degree
zero.

The classical Riemann periodicity relations imply that the lattice $H_1(X, \mathbb{Z})$
is self-dual with respect to a canonical Hermitean metric. Hence

$$\widehat{H_1(X, \mathbb{Z})} \cong \mathbb{C}^g/H_1(X, \mathbb{Z}),$$

where $\widehat{H_1(X, \mathbb{Z})}$ denotes the Pontryagin character group of $H_1(X, \mathbb{Z})$. This
shows that $J_X$ can be considered as an algebraic avatar of the 1-cohomology
of $X$.


Properties of Jacobians. 


1) $\dim(J_X) = g$ (the genus of $X$).

2) $J_X$ is an Abelian variety, and for every extension field $L$ of $K$, the group
$J_X(L)$ is canonically isomorphic to the group of divisor classes of degree
zero on $X$ with ground field extended to $L$.

3) Every morphism of curves of finite degree $f : X \to Y$ determines a
functorial homomorphism $f^* : J_Y \to J_X$, corresponding to the inverse
image map on divisor classes.

4) $J_X$ has a canonical principal polarization. This has an algebraic description
as the class of the Poincaré divisor $\theta$. The Poincaré divisor can be defined
as follows. Start with Abel's map

$$\varphi : X \to J_X, \tag{5.3.52}$$

which sends a point $x \in X(K)$ to the divisor class $\mathrm{cl}(x - P)$, where
$P \in X(K)$ is some fixed point. Consider the map

$$\psi : X^g \xrightarrow{\ \varphi^g\ } J_X^g \xrightarrow{\ \mu\ } J_X,$$

where $\mu$ is the addition map. From the Riemann-Roch theorem it follows
that $\psi$ is surjective. Put $\theta = \psi(X^{g-1})$, the set of sums of $g - 1$ points of $\varphi(X)$.


Many geometric and arithmetical properties of a curve X can be read 
off from the properties of its Jacobian. In particular, the classical theorem 
of Torelli (cf. [Wei57]) states that X can be uniquely reconstructed from Jx 
together with its canonical principal polarization. Essentially this theorem 
was used in Faltings’ theory and in earlier constructions due to A.N.Parshin 
and Yu.I.Zarkhin (cf. [Zar74], [Zar85], [Par71], [Par73], [PZ88]). 

If $X$ is defined over $K$, the Jacobian and its principal polarization are both
defined over $K$. If $X$ has a $K$-point $P$, the map (5.3.52) is also defined over
$K$.

One can also prove that if $X$ is an algebraic curve over an algebraic number
field $K$ having good reduction modulo a prime $\mathfrak{p} \subset \mathcal{O}_K$, then $J_X$ with
its canonical projective embedding (given by the divisor $\theta$) also has good
reduction.

Every Abelian variety $A$ over a number field (or absolutely finitely generated
field) $K$ satisfies the Mordell-Weil theorem: $A(K)$ is a finitely generated
commutative group, that is

$$A(K) \cong A(K)_{\mathrm{tors}} \oplus \mathbb{Z}^{r_A},$$

where $A(K)_{\mathrm{tors}}$ is finite and $r_A$ is the rank of $A$ over $K$ (cf. [La83], [Se97] and
the Appendix by Yu. Manin to [Mum74]).

As with elliptic curves, one can define the Selmer groups $S(A, K)_m$ and
the Shafarevich-Tate groups $\text{Ш}(A, K)$. A standard conjecture is that the
latter are all finite.

Every divisor $D$ on $A$ determines a Néron-Tate height

$$h_D : A(K) \otimes \mathbb{R} \to \mathbb{R},$$

and if $D$ (that is, $\mathcal{O}(D)$) is ample, then $h_D$ induces a Euclidean metric on the
$r_A$-dimensional vector space $A(K) \otimes \mathbb{R}$.

A very important role in the theory of Abelian varieties is played by
the endomorphism ring $\mathrm{End}(A)$ of $A$ (over $K$) together with the $\mathbb{Q}$-algebra
$\mathrm{End}(A) \otimes \mathbb{Q}$. It is known that this algebra is semi-simple.

The Abelian variety $A$ is called simple if $\mathrm{End}(A) \otimes \mathbb{Q}$ is simple. A decomposition
of $\mathrm{End}(A) \otimes \mathbb{Q}$ as a sum of simple algebras $R_1 \oplus \dots \oplus R_s$ corresponds to
a decomposition of $A$ into a product of simple Abelian varieties up to isogeny:
there exists an Abelian variety

$$B = B_1 \times \dots \times B_s$$

isogenous to $A$ such that $\mathrm{End}(B_i) \otimes \mathbb{Q} \cong R_i$ ([La58]).

Let $E$ be a Riemannian form corresponding to a polarization of an Abelian
variety $A$ over $\mathbb{C}$. Such a form determines a Rosati involution $\rho$ on
$\mathrm{End}(A) \otimes \mathbb{Q}$ (that is, an anti-isomorphism of order 1 or 2) which satisfies the
relation $E(\lambda x, y) = E(x, \lambda^{\rho} y)$ for every $\lambda \in \mathrm{End}(A) \otimes \mathbb{Q}$. Involutions of this
kind can also be defined over a ground field of finite characteristic.

Semi-simple algebras with involutions have been classified, cf. [Mum74],
[Shi71].

If $K$ is a number field and $g = 1$, then either $\mathrm{End}(A) \otimes \mathbb{Q} = \mathbb{Q}$ or $\mathrm{End}(A) \otimes \mathbb{Q}$
is an imaginary quadratic field $k$. In the latter case $A$ is called an elliptic curve
with complex multiplication. It can be represented as a complex torus $\mathbb{C}/\Lambda_\tau$
(see (5.3.15)) with $\tau \in k$, $\mathrm{Im}(\tau) > 0$.

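We now sketch an analytic construction of the space $\mathcal{A}_g$ of isomorphism
classes of Abelian varieties over $\mathbb{C}$ with principal polarizations. The crucial
observation is that each such variety can be represented as a complex torus
$\mathbb{C}^g/\Lambda_\tau$, where

$$\Lambda = \Lambda_\tau = \{n_1 + n_2 \tau \mid n_1, n_2 \in \mathbb{Z}^g\}, \qquad \tau \in \mathbb{H}_g, \tag{5.3.53}$$

and $\mathbb{H}_g$ is the Siegel upper half space

$$\mathbb{H}_g = \{\tau \in \mathrm{GL}_g(\mathbb{C}) \mid \tau^t = \tau, \ \mathrm{Im}(\tau) \text{ is positive definite}\}.$$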


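In fact, let $A$ be an Abelian variety with a principal polarization, given as
a torus $\mathbb{C}^g/\Lambda$ and a Riemannian form $E$ on $\Lambda$ with determinant 1. Choose
a symplectic basis $\{\omega_1, \omega_2, \dots, \omega_{2g}\}$ of $\Lambda$. Representing $\omega_i$ by its column of
coordinates, we can construct a $(g \times 2g)$-matrix

$$\Omega = (\omega_1, \omega_2, \dots, \omega_{2g}),$$

which is called a period matrix of $A$. Put $\Omega = (\Omega_1, \Omega_2)$ where $\Omega_i \in M_g(\mathbb{C})$.
From (5.3.48) and (5.3.50) it follows that

$$\Omega_2 \Omega_1^t - \Omega_1 \Omega_2^t = 0, \tag{5.3.54}$$

$$2i\,(\Omega_2 \overline{\Omega}_1^{\,t} - \Omega_1 \overline{\Omega}_2^{\,t}) > 0 \quad \text{(is positive definite)}.$$

Thus $\Omega_1, \Omega_2 \in \mathrm{GL}_g(\mathbb{C})$ and $\tau = \Omega_2^{-1} \Omega_1 \in \mathbb{H}_g$. From this one deduces that
the complex variety $A(\mathbb{C})$ is isomorphic to the torus $\mathbb{C}^g/\Lambda_\tau$, and the initial
polarization corresponds to one given by the form

$$E(x_1 + \tau y_1, x_2 + \tau y_2) = x_1^t y_2 - x_2^t y_1,$$

where $x_i, y_i \in \mathbb{R}^g$.
The varieties $\mathbb{C}^g/\Lambda_\tau$ and $\mathbb{C}^g/\Lambda_{\tau'}$ are isomorphic iff

$$\tau' = (A\tau + B)(C\tau + D)^{-1}$$

for a certain matrix $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ from the group

$$\mathrm{Sp}_g(\mathbb{Z}) = \left\{ M = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}_{2g}(\mathbb{Z}) \ \middle|\ M^t J_g M = J_g \right\}. \tag{5.3.55}$$

This group is called the Siegel modular group of genus $g$.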

Summing up, we see that $\mathcal{A}_g$ can be described as the quotient space
$\mathbb{H}_g/\mathrm{Sp}_g(\mathbb{Z})$ where $M$ acts on $\mathbb{H}_g$ by matrix fractional linear transformations.

One can show that $\mathcal{A}_g$ is a complex analytic space of dimension $g(g+1)/2$
with a natural structure as a normal quasi-projective variety defined over $\mathbb{Q}$.
A generic Abelian variety over $\mathbb{C}$ is simple, and its endomorphism ring is $\mathbb{Z}$.

There are important variations of this construction. One can consider families
of pairs $(A, E)$ in which $\mathrm{End}(A)$ and $E$ satisfy some additional constraints,
and one can supply such pairs with so-called level structures, for example a
choice of symplectic basis for the subgroup $A_m$ of points of order $m$. In many
situations there exist universal PEL-families (Polarization, Endomorphisms,
Level), whose bases are very important algebraic varieties (Shimura varieties)
defined over number fields. The action of the Galois group on the algebraic
points of these varieties can be described in considerable detail.


5.3.7 Siegel’s Formula and Tamagawa Measure 


Algebraic groups include not only Abelian varieties but also linear
groups. The latter are affine varieties, whereas Abelian varieties are projective.
The arithmetic of linear groups is a well-developed chapter of algebraic
geometry. For an extensive report on its qualitative aspects we refer the reader
to [Pl82] and [PlRa83]. We shall describe here only classical results
due to C.-L. Siegel, which were generalized and reinterpreted by Weil.
These results give a quantitative form to the Minkowski-Hasse principle for 
quadratic forms, and lead to certain precise formulae of the kind furnished by 
the circle method for the principal terms of some arithmetical functions. 
Siegel’s formulae concern the equations 


$$S[X] = T \qquad (S[X] := X^t S X) \tag{5.3.56}$$

where $S \in M_m(\mathbb{Q})$ and $T \in M_n(\mathbb{Q})$ are the symmetric matrices of $\mathbb{Q}$-rational
quadratic forms $q_S$ and $q_T$, the solutions $X$ being in $M_{m,n}(\mathbb{Q})$.

Let us consider in more detail the case when $S$ and $T$ are the matrices of
integral positive definite quadratic forms corresponding to the lattices $\Lambda_S \subset \mathbb{R}^m$,
$\Lambda_T \subset \mathbb{R}^n$ (in the sense that $q_S$ and $q_T$ express the lengths of elements of
$\Lambda_S$, resp. $\Lambda_T$). Then an integral solution $X$ to (5.3.56) determines an isometric
embedding $\Lambda_T \to \Lambda_S$. Denote by $N(S, T)$ the total number of such maps,
which is also called the number of integral representations of $q_T$ by $q_S$. The
genus of $q_S$ is by definition the set of integral quadratic forms that are equivalent
to $q_S$ over $\mathbb{R}$ and over $\mathbb{Z}_p$ for every prime $p$. The genus consists of a finite
number of classes with respect to integral equivalence. Let $I$ be the set of these classes.
One of Siegel's formulae gives the value of a certain weighted average of the numbers
$N(S_x, T)$ over a set of representatives $S_x$ for the classes $x \in I$ of forms of a given genus.
To be more precise, denote by $w(x)$ the order of the group of orthogonal transformations
of the lattice $\Lambda_{S_x}$, and define the mass of $S$ by the formula

$$\mathrm{Mass}(S) = \sum_{x \in I} \frac{1}{w(x)}. \tag{5.3.57}$$


Assume that $N(S_x, T) \neq 0$ for at least one $x$ (or, equivalently, that there is an
isometric embedding $\Lambda_T \otimes \mathbb{Q} \to \Lambda_S \otimes \mathbb{Q}$) and put

$$\widetilde{N}(S, T) = \frac{1}{\mathrm{Mass}(S)} \sum_{x \in I} \frac{N(S_x, T)}{w(x)}.$$

Siegel's formula expresses this average as a product of local factors

$$\widetilde{N}(S, T) = c_{m-n}\, \alpha_\infty(S, T) \prod_p \alpha_p(S, T), \tag{5.3.58}$$

where $c_1 = 1/2$, $c_a = 1$ for $a > 1$, and the proper local factors are defined as
follows. For a prime $p$ denote by $N(S, T; p^r)$ the number of solutions of the
congruence

$$S[X] \equiv T \pmod{p^r} \qquad (X \in M_{m,n}(\mathbb{Z}/p^r\mathbb{Z})) \tag{5.3.59}$$

and introduce the local density

$$\alpha_p(S, T) = \lim_{r \to \infty} c_{m-n+1}\, N(S, T; p^r)\, p^{-rd}, \qquad d = mn - n(n+1)/2 \tag{5.3.60}$$


(the expression inside the limit actually stabilizes when $r$ is sufficiently
large). One can define $\alpha_\infty(S, T)$ similarly, replacing the $p$-adic measure by
an Archimedean measure. Consider a neighbourhood $V$ of the matrix $T$ in the
space of symmetric matrices $\{T = (t_{ij}) \in M_n(\mathbb{R}) \mid T^t = T\}$ with the measure
given by the volume form $\alpha_T = \bigwedge_{i \le j} dt_{ij}$. Put

$$U = \{X = (x_{ij}) \in M_{m,n}(\mathbb{R}) \mid X^t S X \in V\}.$$

It is a subset of $M_{m,n}(\mathbb{R})$ with a measure $\beta_X = \bigwedge_{i,j} dx_{ij}$ ($i = 1, \dots, m$; $j = 1, \dots, n$).
Finally put

$$\alpha_\infty(S, T) = c_{m-n+1} \lim_{V \to T} \frac{\int_U \beta_X}{\int_V \alpha_T}. \tag{5.3.61}$$


The product in (5.3.58) converges absolutely if $m > 3$ and $m - n \neq 2$.
In the special case $T = S$ we have

$$\widetilde{N}(S, S) = \frac{1}{\mathrm{Mass}(S)}$$

and (5.3.58) becomes the Minkowski-Siegel formula ([Se86], p. 671)

$$\mathrm{Mass}(S) = c_m\, \alpha_\infty(S, S)^{-1} \prod_p \alpha_p(S, S)^{-1}. \tag{5.3.62}$$

If $n = 1$, $T = (t)$, then $N(S, T)$ is the number of integral representations of
a positive integer $t$ by the quadratic form $q_S$.

Note that for almost all primes $p$ (i.e. for all but a finite number) each
solution of the congruence (5.3.59) with $r = 1$ can be lifted to a solution of the
corresponding congruence modulo any $p^r$ (using Hensel's lemma). In this case we
have

$$\alpha_p(S, T) = c_{m-n+1}\, \frac{N(S, T; p)}{p^{d}}, \tag{5.3.63}$$

which makes it possible to describe explicitly almost all the local factors in
(5.3.58) and to express this product in terms of special values of certain
zeta-functions (for example, values of the Riemann zeta-function at integers).
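The stabilization of the local density can be watched on a tiny case. The following is a minimal brute-force sketch, assuming $S = I_4$ (the sum of four squares), $n = 1$, $T = (1)$ and $p = 3$, so that $d = mn - n(n+1)/2 = 3$ and $c_{m-n+1} = c_4 = 1$; the function name `local_density` is purely illustrative.

```python
from fractions import Fraction
from itertools import product

def local_density(m, t, p, r):
    """Return N(I_m, (t); p^r) * p^(-r*d) for S = I_m and n = 1, where d = m - 1.

    N counts the vectors x in (Z/p^r Z)^m with x_1^2 + ... + x_m^2 = t (mod p^r).
    """
    q = p ** r
    count = sum(
        1
        for x in product(range(q), repeat=m)
        if (sum(v * v for v in x) - t) % q == 0
    )
    return Fraction(count, q ** (m - 1))

# Density of solutions of x1^2 + x2^2 + x3^2 + x4^2 = 1 modulo 3 and modulo 9.
d_mod_3 = local_density(4, 1, 3, 1)
d_mod_9 = local_density(4, 1, 3, 2)
print(d_mod_3, d_mod_9)  # both equal 8/9 = 1 - 1/3^2
```

Every solution modulo 3 has a nonzero coordinate, so Hensel's lemma lifts it to exactly $3^3$ solutions modulo 9, which is why the two ratios agree.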

Consider for example the quadratic form $q_S = \sum_{i=1}^{m} x_i^2$ given by the identity
matrix $S = I_m$. If $m$ is divisible by 4, then (5.3.62) takes the following
form ([Se86], p. 673):

$$\mathrm{Mass}(I_m) = (1 - 2^{-k})(1 + \varepsilon\, 2^{1-k})\, \frac{|B_k \cdot B_2 B_4 \cdots B_{2k-2}|}{2 \cdot k!},$$

where $k = m/2$, $\varepsilon = (-1)^{k/2}$ and $B_i$ is the $i$-th Bernoulli number. For example, for
$m = 9$ (which is not divisible by 4) there are exactly two classes in the genus of the form $I_9$, and
$\mathrm{Mass}(I_9) = 17/2786918400$.
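The mass of $I_m$ can be cross-checked in small dimensions by summing $1/w(x)$ over the known classes of the genus. The sketch below assumes the closed form $\mathrm{Mass}(I_m) = (1 - 2^{-k})(1 + \varepsilon 2^{1-k})\,|B_k B_2 B_4 \cdots B_{2k-2}|/(2 \cdot k!)$ with $k = m/2$, $\varepsilon = (-1)^{k/2}$, for $m$ divisible by 4. It also assumes the classical lists of odd unimodular lattices ($I_4$; $I_8$; $I_{12}$, $E_8 \oplus I_4$, $D_{12}^+$; and $I_9$, $E_8 \oplus I_1$) together with their automorphism group orders, none of which is stated in the text above.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(n_max):
    """Bernoulli numbers B_0..B_n_max (convention B_1 = -1/2) as exact fractions."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def mass_formula(m):
    """Assumed closed form for Mass(I_m), valid when m is divisible by 4."""
    assert m % 4 == 0
    k = m // 2
    eps = (-1) ** (k // 2)
    B = bernoulli(2 * k)
    prod = B[k]
    for i in range(1, k):
        prod *= B[2 * i]
    return ((1 - Fraction(1, 2 ** k)) * (1 + eps * Fraction(2, 2 ** k))
            * abs(prod) / (2 * factorial(k)))

def aut_I(m):
    """Order of Aut(I_m): signed permutation matrices."""
    return 2 ** m * factorial(m)

AUT_E8 = 696729600                   # order of the Weyl group W(E_8) (assumed known value)
AUT_D12_PLUS = 2 ** 11 * factorial(12)  # order of Aut(D_12^+) = W(D_12) (assumed known value)

# Mass computed directly as the sum of 1/w(x) over the classes of the genus.
mass_by_classes = {
    4: Fraction(1, aut_I(4)),
    8: Fraction(1, aut_I(8)),
    12: Fraction(1, aut_I(12)) + Fraction(1, AUT_E8 * aut_I(4)) + Fraction(1, AUT_D12_PLUS),
}
mass_I9 = Fraction(1, aut_I(9)) + Fraction(1, AUT_E8 * aut_I(1))

for m, direct in mass_by_classes.items():
    print(m, mass_formula(m), direct)
print(mass_I9)  # 17/2786918400
```

The value for $I_9$ is obtained only from the class list, since the closed form is stated for $m$ divisible by 4.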

We now say a few words on how Siegel's formula is proved. The proof
uses the theory of integration over the locally compact group $G = O_m(\mathbb{A})$ of
orthogonal matrices with respect to $S$ with coefficients in the ring of adeles
$\mathbb{A}$. The group $G_\infty = O_m(\mathbb{R})$ is compact in view of the positive definiteness of
$S$. Thus $G$ contains the compact open subgroup $\Omega = G_\infty \times \prod_p G(S_p)$, where
$G(S_p) = O_m(\mathbb{Z}_p)$ is the orthogonal group of the $p$-adic lattice $\Lambda_{S,p} = \Lambda_S \otimes \mathbb{Z}_p$
(preserving the quadratic form $q_S$). The subgroup $\Gamma = O_m(\mathbb{Q})$ of orthogonal
matrices with rational coefficients is discrete in $G$, and $\Gamma \cap \Omega = \mathrm{Aut}\, \Lambda_S$ is
the finite group of automorphisms of the lattice $\Lambda_S$. For every $x = (x_v)_v \in G$
(with $v = p$ or $v = \infty$) one can define a lattice $\Lambda_{S_x}$ such that
$\Lambda_{S_x} \otimes \mathbb{Z}_p = x_p(\Lambda_S \otimes \mathbb{Z}_p)$ for all primes $p$. According to a version of the
Hasse-Minkowski theorem, there is an isomorphism $\Lambda_{S_x} \otimes \mathbb{Q} \cong \Lambda_S \otimes \mathbb{Q}$, and the
double cosets $\Omega x \Gamma$ of $G$ modulo $\Omega$ and $\Gamma$ can be interpreted as $\mathbb{Z}$-classes of forms
$S_x$ ($x \in I$). The finite group $\gamma_x = \Omega \cap x \Gamma x^{-1}$ of order $w(x)$ is the group of
automorphisms of the lattice $\Lambda_{S_x}$. Below a normalized Haar measure $\tau$ on the group $G$
will be constructed. This measure is invariant under both right and left group shifts, and has the
property that the volume $\mathrm{vol}(G/\Gamma)$ of the compact set $G/\Gamma = \bigcup_{x \in I} \Omega x \Gamma/\Gamma$
is uniquely defined (not only up to a multiplicative constant). This measure
is called the Tamagawa measure on $G$. The following formula holds:

$$\mathrm{vol}(G/\Gamma) = \sum_{x \in I} \mathrm{vol}(\Omega/\gamma_x) = \mathrm{vol}(\Omega) \sum_{x \in I} \frac{1}{w(x)}. \tag{5.3.64}$$


Let $g$, $\gamma$ be closed subgroups of $G$ and suppose that the volume $\mathrm{vol}(g/\gamma)$
is finite. Consider a continuous function $\varphi$ with compact support on $G/g$,
invariant under left shifts of the argument by elements of $\Omega$. For $x \in G$ put

$$N_x(\varphi) = \sum_{y \in \Gamma/\gamma} \varphi(xy).$$

This sum is finite and depends only on the double coset $\Omega x \Gamma$. Consider the
weighted average $\widetilde{N}(\varphi)$ of the quantities $N_x(\varphi)$ as $x$ runs through $I$,

$$\widetilde{N}(\varphi) = \frac{\sum_{x \in I} N_x(\varphi)/w(x)}{\sum_{x \in I} 1/w(x)}. \tag{5.3.65}$$

Standard integration techniques then show that

$$\widetilde{N}(\varphi) = \frac{\mathrm{vol}(g/\gamma)}{\mathrm{vol}(G/\Gamma)} \int_{G/g} \varphi(x)\, dx \tag{5.3.66}$$

assuming that the measures on the groups $G$ and $g$ and the homogeneous
space $G/g$ are compatible.

Siegel's formula can be deduced from equation (5.3.66) by taking for $g$
the orthogonal adelic group with respect to the quadratic module $W$ over $\mathbb{Q}$
defined by the condition $W \oplus (\Lambda_T \otimes \mathbb{Q}) = \Lambda_S \otimes \mathbb{Q}$. For the group $\gamma$ we take
the group of rational points in $g$, and the homogeneous space $G/g$ is identified
with the set of embeddings $\Lambda_T \otimes \mathbb{A} \to \Lambda_S \otimes \mathbb{A}$ preserving the quadratic forms.
For $\varphi$ one takes the characteristic function of the set of those embeddings
$\Lambda_T \otimes \mathbb{A} \to \Lambda_S \otimes \mathbb{A}$ which take $\Lambda_T \otimes \mathbb{Z}_p$ into $\Lambda_S \otimes \mathbb{Z}_p$ for every prime $p$.
The quantities $c_{m-n}$ and $c_m$ become the Tamagawa numbers $\tau(O_{m-n})$ and $\tau(O_m)$ respectively.
For $x = (x_v)_v \in G$ the function $\varphi(x)$, which depends only on the coset of $x$ modulo $g$,
has the form $\prod_v \varphi_v(x_v)$, where $\varphi_\infty \equiv 1$ on $G_\infty$ and $\varphi_p(x_p)$ is the
characteristic function of $O_m(\mathbb{Z}_p)$. The integral in (5.3.66) is therefore equal to the product

$$\int_{G_\infty/g_\infty} dx_\infty \times \prod_p \int_{G_p/g_p} dx_p,$$

where $G_p = O_m(\mathbb{Z}_p)$, $g_p = O_{m-n}(\mathbb{Z}_p)$, and one easily verifies that

$$\alpha_\infty(S, T) = \int_{G_\infty/g_\infty} dx_\infty, \qquad \alpha_p(S, T) = \int_{G_p/g_p} dx_p. \tag{5.3.67}$$


Then the evaluation of

$$\tau(O_m) = \mathrm{vol}(G/\Gamma)$$

can also be made using (5.3.66), putting $n = 1$ and applying some known
asymptotic results for the representation numbers $N(S, T)$ as $t \to \infty$. The
latter are obtained for example by the circle method (the cases $m = 2, 3$ must
be treated separately).

Now we describe the Tamagawa measure on G; formulae (5.3.67) follow 
from this description (cf. [CF67], chap. X). 

Let $V$ be an algebraic variety over a number field $K$, which is a connected
linear algebraic group. If $\dim V = n$ then there is a non-vanishing, left invariant
$n$-form $\omega$ on $V$ defined over $K$. Any two of these differ by a multiplicative
constant $\lambda \in K^\times$. We now construct a measure on the group $V(\mathbb{A}_K)$ of adelic
points of the variety $V$. For this purpose one must first fix a Haar measure $\mu_v$
on the additive group $K_v$, where $v$ is a normalized valuation on $K$. In order
to do this we set $\mu_v(\mathcal{O}_v) = 1$ if $v$ is non-Archimedean, $d\mu_v = dx$ for $K_v = \mathbb{R}$
(Lebesgue measure) and $d\mu_v = |dz \wedge d\bar{z}|$ for $z = x + iy \in K_v = \mathbb{C}$. Then
according to (4.3.46) one has $\mu(\mathbb{A}_K/K) = |D_K|^{1/2}$, where $D_K$ is the discriminant
of $K$ and $\mu$ is the Haar measure on $\mathbb{A}_K$ defined as the product of the local
measures $\mu_v$. Define a measure $\omega_v$ on $V(K_v)$ as follows. In a neighbourhood
of a point $P$ of $V$ the form $\omega$ is defined by the expression

$$\omega = f(x)\, dx_1 \wedge \dots \wedge dx_n,$$

where $x_1, \dots, x_n$ are local parameters at $P$ which are certain rational functions
$x_i \in K(V)$ and $f \in K(V)$ is a rational function regular at $P$. The
function $f$ can be written as a formal power series in the $x_i$ with coefficients
in $K$, because the variety of an algebraic group is always non-singular. If
the coordinates of $P$ belong to $K_v$, then $f$ is a power series in the variables
$t_i = x_i - x_i^0$ (where $x_i^0 = x_i(P)$) with coefficients in $K_v$, which converges in a
neighbourhood of the origin in $K_v^n$. Thus there exists a neighbourhood $U$ of $P$ in $V(K_v)$ such that
$\varphi : x \mapsto (t_1(x), \dots, t_n(x))$ is a homeomorphism of $U$ onto a neighbourhood
$U'$ of the origin in $K_v^n$, and the power series converges in $U'$. In $U'$ we have
the positive measure $|f(x)|_v\, dt_1 \cdots dt_n$, where $dt_1 \cdots dt_n$ is the product
$\mu_v \times \dots \times \mu_v$ on $K_v^n$; we lift it to $U$ using $\varphi$ and thus obtain a positive measure
$\omega_v$ on $U$. Explicitly, if $g$ is a continuous real valued function on $V(K_v)$
supported on $U$ then

$$\int_U g\, \omega_v = \int_{U'} g(\varphi^{-1}(t))\, |f(\varphi^{-1}(t))|_v\, dt_1 \cdots dt_n,$$

so that $\omega_v$ is in fact independent of the choice of local parameters. If the product

$$\prod_{v \nmid \infty} \omega_v(V(\mathcal{O}_v)) \tag{5.3.68}$$

converges absolutely, then we define the Tamagawa measure by the formula

$$\tau = |D_K|^{-n/2} \prod_v \omega_v. \tag{5.3.69}$$

If the product (5.3.68) does not converge absolutely then one needs to introduce
certain correcting factors $\lambda_v > 0$, which ensure the convergence in such
a way that the product

$$\prod_{v \nmid \infty} \lambda_v^{-1}\, \omega_v(V(\mathcal{O}_v))$$

will converge absolutely. The Tamagawa measure (with respect to $\{\lambda_v\}$) is
then defined by the formula

$$\tau = |D_K|^{-n/2} \prod_{v \mid \infty} \omega_v \prod_{v \nmid \infty} \lambda_v^{-1} \omega_v. \tag{5.3.70}$$

In any case, it follows from the product formula that $\tau$ is independent of the
choice of $\omega$: if we replace $\omega$ by $c\omega$ ($c \in K^\times$) then $(c\omega)_v = |c|_v\, \omega_v$, and by the
product formula (4.3.31) one has $\prod_v |c|_v = 1$.

Let $k(v)$ denote the residue field with respect to a non-Archimedean place
$v$ and let $V^{(v)} = V \otimes k(v)$ be the reduction of $V$ modulo the corresponding
prime ideal $\mathfrak{p}_v \subset \mathcal{O}_v$. Then one can show, generalizing Hensel's lemma, that
for almost all $v$

$$\omega_v(V(\mathcal{O}_v)) = \mathrm{N}v^{-n}\, \mathrm{Card}\, V^{(v)}(k(v)), \tag{5.3.71}$$

where $\mathrm{N}v$ denotes the number of elements of $k(v)$ and $V^{(v)}(k(v))$ is the group
of points of $V^{(v)}$ with coefficients in $k(v)$.


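Examples. If $V = \mathbb{G}_a$ (the additive group) then

$$\omega_v(V(\mathcal{O}_v)) = \mu_v(\mathcal{O}_v) = 1;$$

if $V = \mathbb{G}_m$ (the multiplicative group) then

$$\omega_v(V(\mathcal{O}_v)) = \frac{\mathrm{N}v - 1}{\mathrm{N}v} = 1 - \frac{1}{\mathrm{N}v};$$

if $V = \mathrm{GL}_m$ then, by (5.3.71) and the order of $\mathrm{GL}_m(k(v))$,

$$\omega_v(V(\mathcal{O}_v)) = \prod_{i=1}^{m} \left(1 - \frac{1}{\mathrm{N}v^{i}}\right);$$

if $V = \mathrm{SL}_m$ then

$$\omega_v(V(\mathcal{O}_v)) = \prod_{i=2}^{m} \left(1 - \frac{1}{\mathrm{N}v^{i}}\right).$$

The product

$$\prod_{v \nmid \infty} \left(1 - \frac{1}{\mathrm{N}v^{s}}\right) = \zeta_K(s)^{-1}$$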
converges for $\mathrm{Re}(s) > 1$ but diverges at $s = 1$ (here $\zeta_K(s)$ denotes the
Dedekind zeta function of $K$). The product $\prod_v \omega_v(V(\mathcal{O}_v))$ therefore converges
for $V = \mathrm{SL}_m$ but diverges for $V = \mathrm{GL}_m$. In the latter case one could take
for the correcting factors the numbers $\lambda_v = 1 - \frac{1}{\mathrm{N}v}$. More generally one can
show that if $V = G$ is a semi-simple algebraic group then the product (5.3.68)
converges absolutely and the correcting factors are not needed. The link between
Tamagawa numbers and Siegel's research in the arithmetical theory of
quadratic forms was discovered by Weil in the late 1950s. He formulated during
this time a conjecture later proved by Kottwitz, saying that for a connected,
simply connected, semi-simple algebraic group $V$ over a number field $K$, which
contains no factors of type $E_8$, one has $\tau(V) = 1$. For a connected, reductive
group $G$ over $K$ it was proved by Sansuc and Kottwitz that

$$\tau(G) = \frac{|\mathrm{Pic}(G)|}{|\text{Ш}(G)|},$$

where $\text{Ш}(G)$ is the Shafarevich-Tate group and $\mathrm{Pic}(G)$ is the Picard group
of the affine variety (linear algebraic group) $G$, cf. [Kott88].
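The local volumes in the examples above can be checked by brute force through formula (5.3.71). The sketch below assumes $K = \mathbb{Q}$, $m = 2$ and a small prime $p$, so that $\mathrm{N}v = p$, $\dim \mathrm{GL}_2 = 4$ and $\dim \mathrm{SL}_2 = 3$; the function name `local_volumes` is illustrative only.

```python
from fractions import Fraction
from itertools import product

def local_volumes(p):
    """Return (p^-4 * #GL_2(F_p), p^-3 * #SL_2(F_p)) by direct enumeration."""
    gl = sl = 0
    for a, b, c, d in product(range(p), repeat=4):
        det = (a * d - b * c) % p
        gl += det != 0   # invertible matrices
        sl += det == 1   # matrices of determinant one
    return Fraction(gl, p ** 4), Fraction(sl, p ** 3)

for p in (2, 3, 5):
    vol_gl, vol_sl = local_volumes(p)
    # Expected Euler factors: (1 - 1/p)(1 - 1/p^2) for GL_2 and (1 - 1/p^2) for SL_2.
    print(p, vol_gl, vol_sl)
```

The factor $1 - 1/p$ present for $\mathrm{GL}_2$ and absent for $\mathrm{SL}_2$ is exactly the one responsible for the divergence of the product over all $p$, and it is the one removed by the correcting factors $\lambda_v$.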

Eskin, Rudnick, Sarnak in [ERS91] gave a new proof of Siegel’s famous 
mass formula; they used harmonic analysis to obtain an asymptotic formula for 
the distribution of integral points on certain affine varieties. In particular, they 
gave a new proof of Siegel’s theorem for indefinite quadrics (n = 1, m > 4). 
From this it was deduced that the Tamagawa number of any special orthogonal 
group is 2, which yields the general Siegel result through a computation of 
adelic volumes with respect to the Tamagawa measure. Heights and Tamagawa
measures on Fano varieties were studied by E. Peyre in [Pey95].


5.4 Diophantine Equations and Galois Representations 


5.4.1 The Tate Module of an Elliptic Curve 


Let $E$ be an elliptic curve defined over a number field $K$. Then the Galois
group $G_K = G(\overline{K}/K)$ acts on the group $E_n$ of all points of order dividing $n$,
$E_n \cong (\mathbb{Z}/n\mathbb{Z})^2$, so we obtain a Galois representation

$$\varphi_n : G_K \to \mathrm{GL}_2(\mathbb{Z}/n\mathbb{Z}) \cong \mathrm{Aut}\, E_n.$$

Now let $l$ be a prime number, $n = l^m$. Set

$$T_l(E) = \varprojlim_m E_{l^m} \cong \mathbb{Z}_l^2, \tag{5.4.1}$$

$$V_l(E) = T_l(E) \otimes \mathbb{Q} \cong \mathbb{Q}_l^2,$$

where $\mathbb{Z}_l$ is the ring of $l$-adic integers and the limit is taken over the set of
homomorphisms $E_{l^m} \to E_{l^{m-1}}$ which multiply each point by $l$. The corresponding
homomorphism

$$\rho_l : G_K \to \mathrm{Aut}\, V_l(E) \cong \mathrm{GL}_2(\mathbb{Q}_l) \tag{5.4.2}$$

is a continuous representation of the group $G_K$ over the field $\mathbb{Q}_l$. Its image
$\mathrm{Im}\, \rho_l = G_l$ is a closed subgroup of $\mathrm{GL}_2(\mathbb{Z}_l) \cong \mathrm{Aut}\, T_l(E)$, and the Weil pairing
(5.3.37) determines an isomorphism of $\det \rho_l$ with the representation of $G_K$
on the one-dimensional vector space

$$V_l(\mu) = T_l(\mu) \otimes \mathbb{Q}, \qquad T_l(\mu) = \varprojlim_m \mu_{l^m}$$

(the Tate module defined as the projective limit of the groups of roots of unity of $l$-power
degree).

It follows from recent results of Faltings that the $G_K$-module $T_l(E)$
uniquely determines the curve $E$ up to an isogeny.

Serre discovered (cf. [Se68a]) that the image $\mathrm{Im}\, \rho_l$ is as large as it could
possibly be for almost all primes $l$. More precisely, this image coincides with
$\mathrm{GL}_2(\mathbb{Z}_l) \cong \mathrm{Aut}\, T_l(E)$, provided that the curve $E$ is not special in the sense
that it admits no complex multiplication, or equivalently $\mathrm{End}(E) = \mathbb{Z}$. Moreover
the index of the subgroup $\varphi_n(G_K)$ in $\mathrm{GL}_2(\mathbb{Z}/n\mathbb{Z}) \cong \mathrm{Aut}\, E_n$ is bounded
by a constant which is independent of $n$, although it may depend on the curve $E$ and
on the field $K$ (cf. [Ner76], [Silv86]). The occurrence of small images $\mathrm{Im}\, \varphi_n$ is closely related to
the existence of $K$-rational points of finite order (or of $K$-rational subgroups
of such points). For example, if there exists a basis $P, Q$ of the group $E_n$ over
$\mathbb{Z}/n\mathbb{Z}$ such that the point $P$ is $K$-rational, i.e. $P \in E_n(K)$, then $P^\sigma = P$ for
all $\sigma \in G_K$. Elements in the image $\varphi_n(G_K)$ are therefore represented by matrices
of the form $\begin{pmatrix} 1 & * \\ 0 & * \end{pmatrix}$ in $\mathrm{GL}_2(\mathbb{Z}/n\mathbb{Z})$. If the subgroup $\langle Q \rangle$ is also $K$-rational,
i.e. $\langle Q \rangle^\sigma = \langle Q \rangle$, then elements in the image have the form $\begin{pmatrix} 1 & 0 \\ 0 & * \end{pmatrix}$. The result
of Serre is therefore closely related to Mazur's theorem on the universal
boundedness of the torsion subgroup of an elliptic curve over $\mathbb{Q}$ (cf. [Maz77]).

Let $A$ be an Abelian variety of dimension $g$ defined over $K$. Then the Tate
module is defined by

$$T_l(A) = \varprojlim_m \mathrm{Ker}(A \xrightarrow{\ l^m\ } A) \cong \mathbb{Z}_l^{2g},$$
$$V_l(A) = T_l(A) \otimes \mathbb{Q} \cong \mathbb{Q}_l^{2g},$$

and we again have a Galois representation (see (5.3.52))

$$\rho_l : G_K \to \mathrm{GSp}_{2g}(\mathbb{Q}_l) \subset \mathrm{Aut}\, V_l(A). \tag{5.4.3}$$

Note that certain results are known on the maximality of the image of
the Galois representation $\rho_l$ for higher dimensional Abelian varieties $A$ with
$\mathrm{End}\, A = \mathbb{Z}$ (i.e. without complex multiplication).

The study of the image of $\rho_l$ is based on an examination of the reduction of
the elliptic curve (or Abelian variety) modulo $\mathfrak{p}_v$, where $v$ is a finite place of $K$.
The condition that $E$ has good reduction $\widetilde{E}_v = E \bmod \mathfrak{p}_v$ is equivalent to the
existence of an Abelian scheme $E_{\mathcal{O}_v}$ over $\mathrm{Spec}\, \mathcal{O}_v$ in the sense of Mumford (cf.
[Mum74]) whose generic fiber coincides with $E$ (i.e. $E_{\mathcal{O}_v} \otimes_{\mathcal{O}_v} K_v = E \otimes_K K_v$)
and whose closed fiber is an elliptic curve (Abelian variety) $\widetilde{E}_v = E_{\mathcal{O}_v} \otimes_{\mathcal{O}_v} k(v)$
over the residue field $k(v) = \mathcal{O}_v/\mathfrak{p}_v$. The (geometric) Frobenius endomorphism
$F_v$ of $\widetilde{E}_v$ is defined by raising the coordinates of points on $\widetilde{E}_v$ to their
$\mathrm{N}v = |k(v)|$-th powers.

Now let $p_v$ denote the characteristic of the residue field $k(v)$ and let $l$ be
another prime number (not $p_v$). Denote by $G_v$ (respectively $I_v$) the decomposition
group (respectively, the inertia group) of some extension $\bar{v}$ of $v$ to
a fixed algebraic closure $\overline{K}$ of $K$ (compare with (4.4.2)). If $E$ has good reduction
at $v$ then reduction modulo $\bar{v}$ defines (in view of Hensel's lemma) an isomorphism from
$E_{l^m}$ to the corresponding subgroup of the curve $\widetilde{E}_v$. In particular, the inertia
group $I_v$ acts trivially on $E_{l^m}$, $T_l(E)$ and $V_l(E)$, so the action $\rho_l(Fr_v)$ of the
arithmetical Frobenius automorphism $Fr_v$ is well-defined ($Fr_v \in G_v/I_v$) and
is the same as the action of the geometric Frobenius $F_v = F_{\widetilde{E}_v}$. One therefore
has

$$\det \rho_l(Fr_v) = \det(F_v) = \mathrm{N}v = \mathrm{Card}\, k(v), \tag{5.4.4}$$

and the quantity

$$\det(1_2 - \rho_l(Fr_v)) = \det(1 - F_v) = 1 - \mathrm{Tr}\, F_v + \mathrm{N}v \tag{5.4.5}$$

is equal to the number $\mathrm{Card}\, \widetilde{E}_v(k(v))$ of $k(v)$-points of the reduction $\widetilde{E}_v$.
Conversely, one has the following


Theorem 5.11 (Criterion of Néron-Ogg-Shafarevich). If the Galois
representation $\rho_l$ is unramified at $v$ for some $l \neq p_v$ then $E$ has good reduction
at $v$.


(cf. [Silv86], Ch. 4 of [Se68a]).


5.4.2 The Theory of Complex Multiplication


(see Chapter XIII of [CF67], [La73/87], [Shi71]). One of the central aims of
algebraic number theory was formulated in 1900 by Hilbert in Paris as his
twelfth problem: that of finding an explicit construction of all Abelian extensions
of a given number field $K$. For $K = \mathbb{Q}$ it is known (by the Kronecker-Weber
theorem, compare with §4.1.2) that the maximal Abelian extension $\mathbb{Q}^{ab}$
of $\mathbb{Q}$ is cyclotomic, and that there is an isomorphism

$$G(\mathbb{Q}^{ab}/\mathbb{Q}) \cong \prod_p \mathbb{Z}_p^\times.$$


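If $K$ is an imaginary quadratic extension of $\mathbb{Q}$ then the theory of complex
multiplication makes it possible to construct $K^{ab}$ using elliptic curves $E$ with
complex multiplication by $K$, and their points of finite order. By definition,
one has for such curves $\mathrm{End}\, E \otimes \mathbb{Q} = K$. If $E(\mathbb{C}) = \mathbb{C}/\Gamma$ for a lattice $\Gamma \subset \mathbb{C}$,
then the endomorphism ring of $E$ has the following form

$$\mathrm{End}\, E \cong \{z \in \mathbb{C} \mid z\Gamma \subset \Gamma\} = \mathcal{O}_f = \mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}_K,$$

where $\mathcal{O}_K$ is the maximal order of $K$ and $f$ is an appropriate positive integer
(in view of the fact that every order of $K$, i.e. every subring of $\mathcal{O}_K$ of finite index,
has the form $\mathbb{Z} + f\mathcal{O}_K$ for some $f$).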


Theorem 5.12. There is a one-to-one correspondence between elliptic curves
$E$ with a given endomorphism ring $\mathcal{O}_f$ (up to isomorphism), and elements of
the class group $\mathrm{Cl}(\mathcal{O}_f)$ (i.e. the group of isomorphism classes of projective
modules of rank one over $\mathcal{O}_f$).


Indeed, if a lattice $\Gamma$ corresponds to $E$ then $\Gamma$ is an $\mathcal{O}_f$-module such that
$\Gamma \otimes \mathbb{Q} \cong K$, i.e. a projective $\mathcal{O}_f$-module of rank one. Conversely, every such
$\mathcal{O}_f$-module viewed as a lattice in $\mathbb{C}$ determines an elliptic curve $\mathbb{C}/\Gamma$ with the
property that $\mathrm{End}(\mathbb{C}/\Gamma)$ is the ring of multipliers of $\Gamma$, i.e. $\mathcal{O}_f$. Therefore the
number $h_f$ of curves (up to isomorphism) with a given endomorphism ring $\mathcal{O}_f$
is finite and equal to $\mathrm{Card}\, \mathrm{Cl}(\mathcal{O}_f)$.
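The number $h_f$ can be computed in small cases. The sketch below relies on a classical fact that is not stated in the text above and is assumed here: the class group of the imaginary quadratic order of discriminant $D < 0$ is in bijection with the reduced primitive positive definite binary quadratic forms $ax^2 + bxy + cy^2$ of discriminant $b^2 - 4ac = D$. The function name `class_number` is illustrative.

```python
from math import gcd

def class_number(D):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0.

    Reduced means -a < b <= a <= c, with b >= 0 whenever a == c.
    """
    assert D < 0 and D % 4 in (0, 1)
    h = 0
    a = 1
    while 3 * a * a <= -D:            # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue              # not reduced
            if gcd(gcd(a, abs(b)), c) != 1:
                continue              # not primitive
            h += 1
        a += 1
    return h

# Discriminants of the maximal orders of Q(sqrt(-d)), d = 1, 2, 3, 7, 11, 19, 43, 67, 163.
for D in (-4, -8, -3, -7, -11, -19, -43, -67, -163):
    print(D, class_number(D))
```

For comparison, `class_number(-23)` returns 3 and `class_number(-20)` returns 2, so class number one is a genuinely restrictive condition.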

For each curve there is a canonically defined invariant $j(E)$ corresponding
to $E$; if $E$ is written in the Weierstrass form then this is given by

$$j(E) = \frac{1728\, g_2^3}{g_2^3 - 27 g_3^2}, \qquad E : y^2 = 4x^3 - g_2 x - g_3. \tag{5.4.6}$$

We now consider the case $f = 1$ in more detail.


Theorem 5.13 (Weber-Fueter). (a) All the numbers $j(E)$ are algebraic
integers. (b) If $\alpha = j(E)$ is one of these numbers then $K(\alpha)$ coincides with
the maximal unramified Abelian extension of $K$ and $G(K(\alpha)/K) \cong \mathrm{Cl}(\mathcal{O}_K)$.
The action of $G(K(\alpha)/K)$ on the set of numbers $\{j(E)\}$ is transitive.

There are precisely nine imaginary quadratic rings $\mathcal{O}_f$ with $f = 1$ and $h_f = 1$,
namely the maximal orders of the fields $\mathbb{Q}(\sqrt{-d})$, where $d = 1, 2, 3, 7, 11, 19, 43, 67, 163$.
The corresponding elliptic curves have rational invariants, which are also algebraic integers in
view of the Weber-Fueter theorem, hence $j(E) \in \mathbb{Z}$. Moreover, the values of
$j(E)$ are given respectively by:

$$2^6 \cdot 3^3, \quad 2^6 \cdot 5^3, \quad 0, \quad -3^3 \cdot 5^3, \quad -2^{15}, \quad -2^{15} \cdot 3^3,$$
$$-2^{18} \cdot 3^3 \cdot 5^3, \quad -2^{15} \cdot 3^3 \cdot 5^3 \cdot 11^3, \quad -2^{18} \cdot 3^3 \cdot 5^3 \cdot 23^3 \cdot 29^3. \tag{5.4.7}$$
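The nine integral values of $j$ can be confirmed numerically. The sketch below assumes the standard expansions $j = E_4^3/\Delta$, $E_4 = 1 + 240 \sum_{n \ge 1} n^3 q^n/(1 - q^n)$, $\Delta = q \prod_{n \ge 1} (1 - q^n)^{24}$ with $q = e^{2\pi i \tau}$ (normalized so that $j(i) = 1728$), and takes $\tau = \sqrt{-d}$ for $d = 1, 2$ and $\tau = (1 + \sqrt{-d})/2$ otherwise; the names `j_invariant`, `cm_point` and `CM_J` are illustrative.

```python
import cmath
import math

def j_invariant(tau, terms=60):
    """Klein's j-invariant via j = E4^3 / Delta, in double precision."""
    q = cmath.exp(2j * math.pi * tau)
    e4_sum = 0
    delta = q
    qn = 1
    for n in range(1, terms + 1):
        qn *= q                             # qn = q^n
        e4_sum += n ** 3 * qn / (1 - qn)    # Lambert series for E4
        delta *= (1 - qn) ** 24             # product formula for Delta
    e4 = 1 + 240 * e4_sum
    return e4 ** 3 / delta

def cm_point(d):
    """A point tau whose lattice Z + Z*tau is the maximal order of Q(sqrt(-d))."""
    if d % 4 == 3:
        return complex(0.5, math.sqrt(d) / 2)
    return complex(0.0, math.sqrt(d))

# Expected values, as listed in (5.4.7).
CM_J = {
    1: 2**6 * 3**3, 2: 2**6 * 5**3, 3: 0, 7: -(3**3) * 5**3, 11: -(2**15),
    19: -(2**15) * 3**3, 43: -(2**18) * 3**3 * 5**3,
    67: -(2**15) * 3**3 * 5**3 * 11**3,
    163: -(2**18) * 3**3 * 5**3 * 23**3 * 29**3,
}

for d, expected in CM_J.items():
    print(d, j_invariant(cm_point(d)).real, expected)
```

Double precision is enough to recover the exact integers up to $d = 67$; for $d = 163$ the value has 18 digits, so only agreement to about 15 significant digits can be expected.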


In the general case $f > 1$ the numbers $j(E)$ are also algebraic integers for
all $E$ with $\mathrm{End}(E) = \mathcal{O}_f$, and for all $\sigma \in \mathrm{Gal}(\overline{K}/K)$ one can explicitly describe
the action of $\sigma$ on $j(E)$. This description depends only on the restriction of $\sigma$
to $K^{ab}$, which is represented via the Artin reciprocity law by an idele $s \in J_K$:

$$\sigma|_{K^{ab}} = \psi_K(s) \qquad (\psi_K : J_K \to G(K^{ab}/K) \text{ is the reciprocity map}).$$

Furthermore if $\Gamma$ is the lattice corresponding to a curve $E$ then one can define
a lattice $s^{-1}\Gamma$: if $s = (s_v)_v$ ($s_v \in K_v^\times$) then $s^{-1}\Gamma$ is uniquely determined by
the condition $(s^{-1}\Gamma) \otimes \mathcal{O}_v = s_v^{-1}(\Gamma \otimes \mathcal{O}_v)$ for all finite $v$.


Theorem 5.14. Let $j(s^{-1}\Gamma)$ denote the invariant of the elliptic curve $E'$
defined by $E'(\mathbb{C}) = \mathbb{C}/s^{-1}\Gamma$. Then one has the following formula for the
action of $\sigma \in \mathrm{Gal}(\overline{K}/K)$:

$$j(E)^\sigma = j(s^{-1}\Gamma). \tag{5.4.8}$$


From this it follows that $j(E) \in K^{ab}$. To prove these theorems, one
considers the action of $\sigma \in \mathrm{Gal}(\overline{K}/K)$ on the coefficients of the Weierstrass
equation (5.4.6). One obtains as a result the following new curve:
$E^\sigma : y^2 = 4x^3 - g_2^\sigma x - g_3^\sigma$; therefore $j(E)^\sigma = j(E^\sigma)$. Clearly, one has
$\mathrm{End}(E^\sigma) \cong \mathrm{End}(E) \cong \mathcal{O}_f$, and thus the set $\{j(E)^\sigma \mid \sigma \in \mathrm{Gal}(\overline{K}/K)\}$ is finite
and the numbers $j(E)$ are all algebraic. Consequently the curve $E$ can
be defined over an algebraic number field $L$. If the restriction of $\sigma$ to $L$ is
represented by a Frobenius automorphism for some $v$,

$$\sigma|_L = F_{L/K}(v) = Fr_v,$$

then the above formula (5.4.8) can be established using the reduction $E \bmod \mathfrak{P}$,
where $\mathfrak{P}$ is a prime divisor of $L$ which divides $\mathfrak{p}_v$. Then this formula can be
rephrased as Hasse's theorem:

$$j(E)^{Fr_{\mathfrak{p}}} = j(\mathfrak{p}_f^{-1}\Gamma), \tag{5.4.9}$$

where $\mathfrak{p} \subset \mathcal{O}_K$ is a prime ideal of $\mathcal{O}_K$ with $(\mathfrak{p}, f) = 1$, and
$\mathfrak{p}_f = \mathcal{O}_f \cap \mathfrak{p}$.

The invariants $j(E)$ therefore generate an extension $K_{(f)}/K$ satisfying the
property $G(K_{(f)}/K) \cong \mathrm{Cl}(\mathcal{O}_f)$. However the union $\bigcup_{f \ge 1} K_{(f)}$ does not
yet coincide with the whole of $K^{ab}$, and in order to obtain $K^{ab}$ it is necessary
to adjoin also to $K_{(1)}$ the coordinates of all points of finite order on some
elliptic curve $E$ with the property $\mathrm{End}(E) = \mathcal{O}_K$. More precisely, let $E$ be an
elliptic curve with complex multiplication, i.e. $\mathrm{End}\, E \otimes \mathbb{Q} = K$, defined over
a number field $L \supset K$. Then the image of the Galois representation

$$\rho_l : G_L \to \mathrm{Aut}\, V_l(E) \cong \mathrm{GL}_2(\mathbb{Q}_l) \tag{5.4.10}$$

is an Abelian group which is contained in $(\mathbb{Z}_l \otimes \mathcal{O}_K)^\times$, and the index of $\mathrm{Im}\, \rho_l$
is finite and is bounded by a constant independent of $l$. By class field theory
the representation factorizes through $G_L^{ab}$, and for each idele $s = (s_v)_v \in J_L$
we can define an element $\rho_l(s) = \rho_l(\sigma)$, where $\sigma \in \mathrm{Gal}(\overline{L}/L)$ is determined
by the condition $\sigma|_{L^{ab}} = \psi_L(s)$. It is not difficult to see that there is a unique
continuous homomorphism $\varepsilon : J_L \to K^\times$ with the condition $\varepsilon(x) = N_{L/K}(x)$
for all $x \in L^\times$ and $\rho_l(s) = \varepsilon(s)\, N_{L_l/K_l}(s_l)^{-1}$ for all $s \in J_L$ and all $l$.

The Abelian $l$-adic representations (5.4.10) and the action of $G_K$ on the
invariants $j(E)$ describe explicitly the class field theory of the field $K$. We see
also that in the complex multiplication case the group $\mathrm{Im}\, \rho_l$ is Abelian and
is therefore much smaller than in the general case.

An analogous theory (in a less complete form) also exists for CM-fields
(totally imaginary, quadratic extensions of totally real fields) and for Abelian
varieties of CM-type, i.e. Abelian varieties $A$ whose endomorphism algebras
$\mathrm{End}\, A \otimes \mathbb{Q}$ are totally imaginary, quadratic extensions of totally real fields of
degree $g = \dim A$ (cf. [Shi71]).


5.4.3 Characters of l-adic Representations 


As we have seen, one can associate to each elliptic curve E defined over a
number field K a system of l-adic representations
$\rho_l : G_K \to \operatorname{Aut} T_l(E) \cong \mathrm{GL}_2(\mathbb{Z}_l)$ on the Tate module $T_l(E)$.
Together (5.4.4) and (5.4.5) give the following important formula for the traces
of Frobenius endomorphisms:

$$\operatorname{Tr} \rho_l(Fr_v) = \mathrm{N}v + 1 - N_v(E),$$

where $\mathrm{N}v = \operatorname{Card}(k(v))$ is the norm of v, and $N_v(E) = \operatorname{Card} E_v(k(v))$ is the
number of k(v)-rational points on the reduction $E_v$ of E modulo v. It turns out
that the values of the character $\chi_{\rho_l} = \operatorname{Tr} \rho_l$ form an interesting arithmetical
function of the argument v. We shall later see that the list of examples
of this sort is quite rich and contains, for example, the Ramanujan function
$\tau(p)$, the numbers of representations of positive integers by positive definite
quadratic forms, etc. It is known that the character of $\rho$ uniquely determines
this representation provided that $\rho$ is a semi-simple representation, that is, a
direct sum of irreducible representations. This semi-simplicity property was
established for the Tate modules of elliptic curves and Abelian varieties, cf.
[Fal83], [Fal85], [Fal86], [PZ88].
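The trace formula can be checked numerically in the simplest setting. The following is a minimal sketch, assuming K = Q, a prime p of good reduction, and a short Weierstrass model y^2 = x^3 + ax + b (the function names `count_points` and `frobenius_trace` are illustrative, not taken from the text). It counts points by brute force and returns Nv + 1 - N_v(E) with Nv = p.

```python
def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a*x + b over F_p, including infinity."""
    # square_counts[r] = number of y in F_p with y^2 = r
    square_counts = {}
    for y in range(p):
        r = (y * y) % p
        square_counts[r] = square_counts.get(r, 0) + 1
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        total += square_counts.get(rhs, 0)
    return total


def frobenius_trace(a, b, p):
    """Trace of Frobenius at p, computed as p + 1 - N_p(E)."""
    return p + 1 - count_points(a, b, p)


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, frobenius_trace(-1, 1, p))
```

For primes of good reduction the values returned by `frobenius_trace` satisfy the Hasse bound |Tr| <= 2 sqrt(p); the brute-force count is only practical for small p.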
A remarkable finiteness property for the characters $\chi_\rho$ of continuous
finite-dimensional l-adic representations was discovered by G.Faltings
(cf. [Fal83]) in his proof of the Mordell conjecture: any such character $\chi_\rho$
is uniquely determined by a finite number of values

$$\chi_\rho(Fr_v) = \operatorname{Tr} \rho(Fr_v) \quad (v \in Q,\ Q \text{ a finite set}),$$

where $Fr_v$ denotes the class of a Frobenius element, under the assumption
that $\rho$ is unramified for all v outside a finite set S of places of K. In this
situation the representation $\rho$ factorizes through a representation of the group
$G_S = G(K_S/K)$, where $K_S$ is the maximal extension of K unramified outside
S. For each $v \notin S$ the value $\chi_\rho(Fr_v)$ is therefore well-defined. We shall now
construct a finite set Q of places of K, $Q \cap S = \emptyset$, such that $\rho$ is uniquely
determined by the values $\chi_\rho(Fr_v)$ for $v \in Q$. Let L/K denote the composite of
all Galois extensions of K unramified outside S which are of degree less than
or equal to $l^{2n^2}$, where n is the dimension of $\rho$. Then by Hermite's theorem
(see §4.1.5), the extension L/K is finite. Now we choose an appropriate Q outside S
such that the elements $Fr_v$ ($v \in Q$) fill the whole Galois group G(L/K). The
existence of such elements follows from the Chebotarev density theorem
(Theorem 4.22). We claim that the set Q constructed in this way satisfies the
conditions of the theorem.

Indeed, let $\rho_1$ and $\rho_2$ be two different representations whose characters
coincide on the elements $Fr_v$, $v \in Q$. Consider the representation

$$\rho_1 \times \rho_2 : \mathbb{Z}_l[G] \to M_n(\mathbb{Q}_l) \times M_n(\mathbb{Q}_l)$$

of the group algebra $\mathbb{Z}_l[G]$. Its image M is a $\mathbb{Z}_l$-submodule of rank $\leq 2n^2$. By
the construction of Q the elements $(\rho_1 \times \rho_2)(Fr_v)$, $v \in Q$, generate M/lM
as a vector space over $\mathbb{F}_l$, and consequently the whole of M over $\mathbb{Z}_l$ (by
Nakayama's lemma for finitely generated modules over a local ring, applied
to the ring $\mathbb{Z}_l$, see [Bou62], [5Z75]). Now consider the linear form

$$f(a_1, a_2) = \operatorname{Tr}(a_1) - \operatorname{Tr}(a_2) \quad (a_1, a_2 \in M_n(\mathbb{Q}_l))$$

on M. By the assumption we have that

$$\chi_{\rho_1}(Fr_v) = \chi_{\rho_2}(Fr_v), \quad v \in Q,$$

and therefore $f(a_1, a_2) = 0$ on the whole $\mathbb{Z}_l$-module M, because f = 0 on its
generators $(\rho_1 \times \rho_2)(Fr_v)$, $v \in Q$. Therefore $\chi_{\rho_1} = \chi_{\rho_2}$ on the whole
group, establishing the theorem, see [PZ88], [Del83], [Sz(e)81].


5.4.4 Representations in Positive Characteristic 


Let E be an elliptic curve over a finite field k with q = p? elements. Consider 
its Tate module 
$$T_p(E) = \varprojlim_m \operatorname{Ker}(E \xrightarrow{\ p^m\ } E) \cong \mathbb{Z}_p^{\gamma},$$

where $\gamma = 0$ or $\gamma = 1$. In this way we obtain a representation defined by

$$\rho_p : \mathrm{Gal}(\overline{k}/k) \to \operatorname{Aut} T_p(E),$$

in which the group $\mathrm{Gal}(\overline{k}/k)$ is a (topologically) cyclic group.

If $T_p(E) \neq 0$ then $\operatorname{End} E \otimes \mathbb{Q}$ is an imaginary quadratic field K
and $\operatorname{End} E = O_f$ for some $f \geq 1$, where $O_f = \mathbb{Z} + fO_K$ is a subring of the
maximal order $O_K$ of K.

In this situation one can show that: 


1) the prime p does not divide the conductor f, 
2) p splits in K. 


In the case $T_p(E) = 0$ we have that $D_E = \operatorname{End} E \otimes \mathbb{Q}$ is a division algebra of
degree 4 (a quaternion algebra) over $\mathbb{Q}$ which at all primes $l \neq p$ decomposes
as $D_E \otimes \mathbb{Q}_l \cong M_2(\mathbb{Q}_l)$. Also, End E is a maximal order in $D_E$. Curves with
this property are called supersingular curves.
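The dichotomy between the two cases can be observed on a standard example. The sketch below assumes the specific curve E: y^2 = x^3 + x (which has complex multiplication by Z[i]) over the prime field F_p with p odd, and uses the criterion that E is supersingular exactly when the trace of Frobenius a_p = p + 1 - #E(F_p) is divisible by p. The curve, the function names and the criterion are assumptions of this illustration, not statements from the text.

```python
def trace_of_frobenius(p):
    """a_p = p + 1 - #E(F_p) for E: y^2 = x^3 + x over F_p (p an odd prime)."""
    squares = {}
    for y in range(p):
        r = y * y % p
        squares[r] = squares.get(r, 0) + 1
    affine = sum(squares.get((x ** 3 + x) % p, 0) for x in range(p))
    return p + 1 - (affine + 1)  # +1 for the point at infinity


def is_supersingular(p):
    """Supersingular reduction at p, tested via a_p = 0 (mod p)."""
    return trace_of_frobenius(p) % p == 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23, 29):
        print(p, p % 4, is_supersingular(p))
```

In this example the supersingular primes are those with p = 3 (mod 4), i.e. the primes that do not split in K = Q(i), in agreement with condition 2) for the ordinary case.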

In positive characteristic the endomorphism algebra becomes larger because
there is a Frobenius endomorphism $F_q$ of E, which is a purely inseparable
isogeny; its kernel has only one geometric point over $\overline{k}$. In particular,
if $F_q \in \mathbb{Z} \subset \operatorname{End} E$, then also $T_p(E) = 0$. For further information on
points of finite order in positive characteristic, also in Abelian varieties, see
[Man61], [La73/87], [Mum74].


5.4.5 The Tate Module of a Number Field 


(cf. [Sha69], [Coa73], [Iwa72], [Iw01], [MW83]). The Tate module of the Jacobian
variety $J_C$ of a curve C gives a functor from the category of curves over
a field k to the category of $\mathbb{Z}_l$-modules. If k is finite then the field k(C) of
rational functions on C has much in common with a number field. Iwasawa has
suggested an analogue of the Tate module for a number field K. The group
$J_{l^m}$ can be interpreted as the Galois group of the étale covering $C_m \to C$ of
C, where $C_m$ is the inverse image of C embedded into $J_C$ with respect to the
morphism $l^m$. One verifies that the field $\bigcup_m k(C_m)$ is the maximal unramified
Abelian l-extension of k(C). Its Galois group coincides exactly with the Tate
module $T_l(J_C)$; this gives a reasonable interpretation of the Tate module for
an algebraically closed field k. However if k is not algebraically closed (for
example when k is a finite field) then k need not be algebraically closed inside
the field $k(J_{l^m})$. In particular the field $k(J_{l^m})$ must contain the roots of unity
of degree $l^m$, since these are values of the Weil pairing. In the case of a finite
field this is almost sufficient: that is, for a finite extension k'/k we have


i= Fo (Uni) =U Gn, 
where $\zeta_{l^m}$ denotes a primitive root of unity of degree $l^m$ and $\overline{k}$ is the algebraic
closure of k. Indeed, the image of $\mathrm{Gal}(\overline{k}/k)$ in $\operatorname{Aut} T_l \cong \mathrm{GL}_{2g}(\mathbb{Z}_l)$ is a
topologically cyclic group, whose intersection with the l-Sylow normal subgroup
$S = \{g \in \mathrm{GL}_{2g}(\mathbb{Z}_l) \mid g \equiv 1 \bmod l\}$ is an l-subgroup of finite index. Therefore
on replacing k by a finite extension k’ of degree prime to the characteristic of 
k, the extension k becomes an [-extension of the finite field k’, ie. 


k= (Jk (Gm). 


We see now that for a finite field of constants the Tate module JT; coincides 
with the Galois group of the composite Galois extension 


k(C) ck(C) c AM, 


where k = Umk' (Gm), AO is the maximal unramified, Abelian /-extension of 
k(C). 

Taking this description as a starting point, we may extend the definition
of the Tate module to the number field case. Let K be a number field,
$K_m = K(\zeta_{l^m})$, $K_\infty = \bigcup_m K_m$, and $A^{(l)} \supset K_\infty$ the maximal Abelian unramified
l-extension of $K_\infty$. Further let

$$T_l(K) = \mathrm{Gal}(A^{(l)}/K_\infty). \tag{5.4.11}$$

Then $T_l(K)$ is a projective limit of l-groups (a pro-l-group), and is in particular
a $\mathbb{Z}_l$-module. Iwasawa, who introduced this module (also called the Iwasawa
module), has shown that $V_l(K) = T_l(K) \otimes \mathbb{Q}_l$ is a finite-dimensional
$\mathbb{Q}_l$-vector space. Using class field theory one can describe $T_l(K)$ explicitly. One
knows that the Galois group of the maximal Abelian unramified extension of
a number field L is isomorphic to the class group $Cl_L$. Denoting by $Cl_L^{(l)}$ the
l-component of this group, one obtains the following description:

$$T_l(K) = \varprojlim_m Cl_{K_m}^{(l)},$$

where the inverse limit is taken with respect to the norm maps of ideals.

On $T_l(K)$ we have an obvious action of the Galois group $\mathrm{Gal}(K_\infty/K)$ and its
subgroup $\Gamma = G(K_\infty/K_1) \cong \mathbb{Z}_l$. On a class represented by an ideal $\mathfrak{a} \in Cl_{K_m}^{(l)}$
this action is given by $\mathfrak{a} \mapsto \mathfrak{a}^g$ ($g \in G(K_\infty/K)$), and for the corresponding
$h \in \mathrm{Gal}(A^{(l)}/K_\infty)$ the ideal $\mathfrak{a}^g$ corresponds under class field theory to $g^{-1}hg$
(in view of the equality (4.4.22)).

Iwasawa regarded $T_l(K)$ as a module over the completed group ring
$\Lambda = \mathbb{Z}_l[[\Gamma]] \cong \mathbb{Z}_l[[T]]$ (the ring of formal power series over $\mathbb{Z}_l$). Just using his
classification theory for such modules, he obtained the following formula for
the orders of the groups $Cl_{K_m}^{(l)}$, which is valid for $m \geq m_0$:

$$\log_l \bigl|Cl_{K_m}^{(l)}\bigr| = \lambda m + \mu l^m + \text{const}. \tag{5.4.12}$$
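To see what the growth formula (5.4.12) predicts, one can simply tabulate its right-hand side. The sketch below treats the invariants as inputs: the values of `lam`, `mu` and `nu` used in the demonstration are hypothetical and are not taken from any field discussed in the text.

```python
def class_group_exponent(m, lam, mu, nu, l):
    """Exponent e_m with |Cl^{(l)}_{K_m}| = l**e_m, following
    e_m = lam*m + mu*l**m + nu (valid for m >= m_0)."""
    return lam * m + mu * l ** m + nu


if __name__ == "__main__":
    # Hypothetical invariants, for illustration only.
    for m in range(1, 6):
        linear = class_group_exponent(m, lam=1, mu=0, nu=2, l=3)
        exponential = class_group_exponent(m, lam=1, mu=1, nu=2, l=3)
        print(m, linear, exponential)
```

When mu = 0 the exponent grows linearly in m with slope lam; a nonzero mu forces exponential growth in m.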
Under some additional assumptions he described explicitly the module
$T_l(\mathbb{Q})$ for all l < 4001. This module turns out to be cyclic, and one can even
find a generator of its annihilator. Essentially, this generator coincides with a
product of the l-adic L-functions of Kubota and Leopoldt ([Iwa72], [KuLe64],
[Sha69], [Kuz84]).

The validity of the corresponding statement in the general case (the "Main
conjecture" of Iwasawa theory) was established in 1984 by B. C. Mazur and
A. J. Wiles [MW83]. According to the main conjecture of Iwasawa theory a
module of ideal class groups can be described as the quotient of the Iwasawa
algebra by an explicitly given principal ideal. A later, more accessible proof
using Kolyvagin's notion of Euler systems was found by K. Rubin, cf. his
appendix to [La90].

In the works of Ferrero and Washington ([FeWa79], [Fer88], [Wash82])
another conjecture of Iwasawa was proved, which says that for each Abelian
extension K/Q and each prime l, the invariant $\mu$ of the module $T_l(K)$ vanishes.

This result implies that $T_l(K)$ is a finitely generated $\mathbb{Z}_l$-module.
Washington's conjecture, according to which the orders of the groups $Cl_{K_m}^{(p)}$ stabilize
in the cyclotomic $\mathbb{Z}_l$-extension of an arbitrary Abelian field for $l \neq p$, was
proved in ([Wash78]).

Very recently (cf. [Barsky04]) the vanishing of the Iwasawa $\mu$-invariant was
proved by D.Barsky for all totally real fields. The Iwasawa $\mu$-invariant of
p-adic Hecke L-functions was studied by H.Hida in [Hi02].

The methods of Iwasawa have been considerably extended in further research
related to the study of $\Lambda$-modules of various kinds: those arising from
Selmer groups of Abelian varieties (Mazur modules), see [Man71], [Man76],
[Man78], [Maz79], [Maz83], [Maz86], and also those arising from elliptic units 
in Abelian extensions of fields of CM-type, cf. [Maz83], [Rob73], and the ones 
arising from Heegner points on modular curves ([Koly88], [Coa73], [Coa84], 
[GZ86], [Rub77]). 

New approaches to proving the main conjecture and its generalizations in 
various situations were discovered by Kolyvagin in [Koly90], who proposed the 
more general concept of an “Euler system”, which makes it possible to deal 
with all known cases from a unified point of view. 


For recent developments on Euler Systems we refer to [Rub98], [Kato99], 
[Kato2000], [MazRub04]. 

Interesting Euler systems could be constructed in some cases using Beilinson
elements in $K_2$ of modular curves and the Rankin-Selberg method, cf.
[Scho98]. Analogues of the Selmer groups and of the Shafarevich-Tate groups
were defined in [BK90], [FP-R94] for an arbitrary motive over a number
field F, cf. also a new book by B.Mazur and K.Rubin, [MazRub04].

We only mention a GL(2) version of Iwasawa theory developed by Coates 
et al., cf. [Coa01], [CSS03]. The GL(2) main conjecture for elliptic curves 
without complex multiplication was described very recently by J. Coates, T. 
Fukaya, K. Kato, R. Sujata, O. Venjakob in [CFKSV]. 


5.5 The Theorem of Faltings and Finiteness Problems in 
Diophantine Geometry 


5.5.1 Reduction of the Mordell Conjecture to the finiteness 
Conjecture 


A major problem in diophantine geometry was the Mordell conjecture, now a 


Theorem 5.15 (Faltings [Fal83]). If X is a projective algebraic curve of
genus $g \geq 2$ defined over a number field K and L/K a finite extension then
X(L) is finite.


Note that prior to the work of Faltings this was not known for any curve X. 
However Siegel’s Theorem was known, the strongest finiteness result until 
Faltings: 


Theorem 5.16 (Siegel). If X is an affine curve of genus $g \geq 1$ defined over
the ring of integers $O \subset K$ and $O_S \subset K$ is any subring of S-integral elements
(S finite) then $X(O_S)$ is finite.


Here $O_S \subset K$ denotes the subring

$$O_S = \{x \in K \mid \forall v \notin S,\ v \text{ non-Archimedean},\ |x|_v \leq 1\},$$

where $S \subset \mathrm{Val}(K)$ is a finite set of valuations of K.


The starting point for research leading ultimately to the (first!) proof of
Mordell's conjecture by Faltings was the following pair of conjectures,
proposed by I.R.Shafarevich [Sha62], on the classification problem for algebraic
curves of genus $g \geq 1$ over an algebraic number field K, with a fixed set S of
bad reduction points.*

Let Ш(g, K, S) be the set of (K-isomorphism classes of) algebraic curves
X of genus $g(X) \geq 1$, defined over a number field K, with bad reduction
contained in a finite set $S \subset \mathrm{Val}(K)$. When g = 1 we assume in addition that
$X(K) \neq \emptyset$.


I) Finiteness Conjecture. Assume that $g \geq 2$ (or that $g \geq 1$ and $X(K) \neq \emptyset$).
Then for any given g, K, S there exist only finitely many such curves (up
to isomorphism). This finite set will be denoted by Ш(g, K, S).

II) Bad Reduction Conjecture. If $S = \emptyset$ and $K = \mathbb{Q}$ then there exist no such
curves, i.e. Ш(g, K, S) = $\emptyset$.


* For the second proof of Mordell's conjecture by Bombieri-Vojta we refer to
[Bom90], [Voj91]. This entirely new proof is based on methods from Diophantine
approximation and arithmetic intersection theory. Faltings in [Fal91]
simplified and extended these methods to prove two longstanding conjectures of Lang
concerning integral points on Abelian varieties and rational points on their
subvarieties, and E. Bombieri subsequently further simplified the arguments to give
a comparatively elementary proof of Mordell's conjecture.


These problems generalize theorems of Hermite and Minkowski in the the- 
ory of algebraic number fields (cf. §4.1.5). It was shown by A.N.Parshin in 
[Par72], [Par73], how to reduce Mordell’s problem to the finiteness conjecture. 
His remarkable construction is given below. The Shafarevich Conjecture and 
the related Tate conjecture were proved by Faltings [Fal83] (see also Deligne’s 
Bourbaki talk [Del83]). 

A detailed exposition of all these questions can be found in the survey 
[PZ88]. 


The construction of A.N.Parshin consists in constructing a map

$$\alpha : X(K) \to Ш(g', K', S') \tag{5.5.1}$$

for some other data g', K', S', with the property that the fibers of the
map (5.5.1) are finite. The image of a point $P \in X(K)$ is a certain curve
$X_P \in Ш(g', K', S')$, which is constructed in several steps.


1) Let us map the curve X into its Jacobian J using the Abel map (5.3.52):
   $\varphi_P : X \to J$, and consider the multiplication by 2 morphism $2_J : J \to J$.
   We define an auxiliary curve $X_1$ as the inverse image of X under this map
   (this is an example of the fiber product):

          X_1 ----> X
           |        |
           v        v
           J  ----> J
              2_J

   The curve $X_1$ is smooth, it is defined over the same field K, and its
   genus $g_1$ can be computed using the Hurwitz formula (5.1.1). We have
   $2g_1 - 2 = 2^{2g}(2g - 2)$, because $X_1 \to X$ is an unramified covering of
   degree $2^{2g} = \operatorname{Card} J_2$. The inverse image of the point P is then a rational
   divisor $D = D_P$ on $X_1$ of degree $2^{2g}$.

2) One constructs a covering $X_P \to X_1$ of degree 2, which is ramified only
   over the points in D. Such a covering exists; it has genus $g' = g(X_P)$,
   which is also computed by the Hurwitz formula

   $$2g' - 2 = 2(2g_1 - 2) + 2^{2g} = 2^{2g+1}(2g - 2) + 2^{2g},$$

   that is

   $$g' = 2^{2g+1}(g - 1) + 2^{2g-1} + 1.$$

   One checks that the curve $X_P$ is defined over an algebraic number field
   $K' \supset K$, $[K' : K] < \infty$, which depends only on the data g, K, S, but
   not on the individual point $P \in X(K)$. Moreover, the curve $X_P$ has
   good reduction outside the set S' of non-Archimedean points of K' lying
   over S and over the prime 2. The construction of $X_P$ is analogous to
   the construction of a 2-covering attached to a rational point on an elliptic
   curve (see (5.3.37)). In the same way one proves that the reduction of the
   resulting curve is good outside S' (lying over $S \cup \{2\}$). Then one deduces the
   finiteness of the degree of K'/K using the Theorem of Hermite.
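The genus bookkeeping in steps 1) and 2) is a direct application of the Hurwitz formula and is easy to tabulate. The following sketch (the function name `parshin_genera` is illustrative) computes g_1 and g' from g, and cross-checks against the closed formula for g' given above.

```python
def parshin_genera(g):
    """Genera (g_1, g') of the curves X_1 and X_P built from a curve of genus g >= 2."""
    if g < 2:
        raise ValueError("the construction is applied to curves of genus g >= 2")
    sheets = 2 ** (2 * g)       # degree of X_1 -> X, equal to Card J_2
    # Unramified cover of degree 2^{2g}: 2*g_1 - 2 = 2^{2g} * (2g - 2)
    g1 = sheets * (g - 1) + 1
    # Double cover X_P -> X_1 branched over the 2^{2g} points of D:
    # 2*g' - 2 = 2*(2*g_1 - 2) + 2^{2g}
    g_prime = (2 * (2 * g1 - 2) + sheets) // 2 + 1
    return g1, g_prime


if __name__ == "__main__":
    for g in (2, 3, 4):
        print(g, parshin_genera(g))
```

Already for g = 2 the auxiliary curves have genus 17 and 41, which shows how quickly the target set of the map (5.5.1) moves to high genus.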


The proof of the fact that the fibers of the map (5.5.1) are finite is purely
geometric, and it belongs rather to the theory of Riemann surfaces. Indeed, the
resulting map $X_P \to X$ is ramified over exactly one point, namely P. If
there were infinitely many points P such that the curves $X_P$ were isomorphic,
say to a fixed curve Y, we would have infinitely many maps $Y \to X$ of curves of
genus $\geq 2$, ramified over different points of X. Amongst these maps there are
infinitely many non-isomorphic ones, since the group of analytic automorphisms of a
Riemann surface of genus $g \geq 2$ is finite (its order is bounded by $84(g-1)$, see
[Hur63], [Maz86]). This leads to a contradiction with the classical theorem
of de Franchis: for any closed Riemann surface Y there exist only finitely
many non-constant maps $f : Y \to Z$ into closed Riemann surfaces Z of genus
$g_Z \geq 2$ (up to isomorphism), see [dFr13], [Sev14].

Note that the theorem of de Franchis is itself a special case of an analogue 
of Mordell’s conjecture over function fields (namely, over C(t)). This version 
of Mordell’s problem was solved in [Man63a], [Man63b]. 


5.5.2 The Theorem of Shafarevich on Finiteness for Elliptic Curves 


(cf. [Sha65], [Se68a]) The finiteness conjecture was proved by I.R.Shafarevich
[Sha65] for hyperelliptic curves as a corollary of the Theorem of Siegel on the
finiteness of the number of S-integral points on an affine algebraic curve of
positive genus. We shall give the proof of this result in the case of elliptic
curves. Let us write the curve X = E in the Weierstrass form:

$$E : y^2 = 4x^3 - g_2 x - g_3 \quad (g_2, g_3 \in K). \tag{5.5.2}$$

We next note that if the curve E has good reduction outside S, then its
equation can be reduced to the following form

$$E : y^2 = 4x^3 - g_2 x - g_3 \quad \text{with } \Delta = g_2^3 - 27 g_3^2 \in O_S^\times$$

(it is assumed that the finite set S contains the primes over 2 and 3, and S
is also chosen large enough so that $O_S$ is a P.I.D.). Indeed, if $v \notin S$, then the
curve E can be brought over the local ring $O_v$ to the form

$$E : y^2 = 4x^3 - g_{2,v} x - g_{3,v} \quad \text{with } g_{2,v}, g_{3,v} \in O_v \cap K,\ \Delta_v \in O_v^\times. \tag{5.5.3}$$

By the uniqueness property of the Weierstrass form (5.5.2) one can choose an
element $u_v \in K^\times$ such that

$$g_{2,v} = u_v^4 g_2, \quad g_{3,v} = u_v^6 g_3, \quad \Delta_v = u_v^{12} \Delta,$$
and we may assume that $u_v = 1$ for almost all v.
As the ring $O_S$ is a P.I.D., there is an element $u \in K^\times$ such that
$u/u_v \in O_v^\times$ for all $v \notin S$, and the equation (5.5.2) of the curve takes the form

$$E : y^2 = 4x^3 - g_2' x - g_3'$$

where

$$g_2' = u^4 g_2, \quad g_3' = u^6 g_3, \quad \Delta' = u^{12} \Delta,$$

and it follows that $\Delta' \in O_S^\times$.
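The scaling rule for the discriminant can be verified directly with exact rational arithmetic. This is a small sketch under the assumption K = Q, with sample coefficients chosen only for illustration; the substitution behind it is x -> u^2 x, y -> u^3 y.

```python
from fractions import Fraction


def discriminant(g2, g3):
    """Discriminant g2^3 - 27*g3^2 of y^2 = 4x^3 - g2*x - g3."""
    return g2 ** 3 - 27 * g3 ** 2


def rescale(g2, g3, u):
    """Coefficients after the change of variables x -> u^2 x, y -> u^3 y."""
    return u ** 4 * g2, u ** 6 * g3


if __name__ == "__main__":
    g2, g3, u = Fraction(4), Fraction(1, 3), Fraction(3, 2)  # sample values
    print(discriminant(g2, g3), discriminant(*rescale(g2, g3, u)))
```

The second printed value is u^12 times the first, so changing the model only moves the discriminant within its class modulo twelfth powers.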

Now we can multiply $\Delta$ by any number $u \in (O_S^\times)^{12}$ keeping fixed the
isomorphism class of the curve. It follows from a version of Dirichlet's unit
theorem (on S-units, see §4.1.6) that the group $O_S^\times/(O_S^\times)^{12}$ is finite. There
therefore exists a finite set $M \subset O_S^\times$ such that any elliptic curve of the given
form can be reduced to the form (5.5.3) with $g_i \in O_S$, $\Delta \in M$. On the other
hand, for a given $\Delta$ the equation

$$U^3 - 27V^2 = \Delta$$

is an affine curve of genus 1, which has only a finite number of solutions in $O_S$
by the Theorem of Siegel (see [Sie29], [La60], [Mah34]).
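For a feeling of what Siegel's theorem asserts here, one can list the small solutions in the simplest case. The sketch below assumes K = Q and S empty, so that O_S = Z; the function name and the search bound are illustrative. A bounded search of course proves nothing about finiteness; it only exhibits the solutions inside the chosen box.

```python
from math import isqrt


def integral_solutions(delta, bound):
    """All integer pairs (U, V) with |U| <= bound and U^3 - 27*V^2 = delta."""
    found = []
    for U in range(-bound, bound + 1):
        rest = U ** 3 - delta        # must equal 27 * V^2
        if rest < 0 or rest % 27 != 0:
            continue
        q = rest // 27
        V = isqrt(q)
        if V * V == q:
            found.append((U, V))
            if V != 0:
                found.append((U, -V))
    return found


if __name__ == "__main__":
    print(integral_solutions(1, 1000))
    print(integral_solutions(-27, 1000))
```

Each pair (U, V) found this way gives coefficients g_2 = U, g_3 = V of a Weierstrass equation with the prescribed discriminant.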

The same idea is used in the proof of the semisimplicity of the Tate module 
of an elliptic curve (see [Se68a]). 


5.5.3 Passage to Abelian varieties 


In order to prove the Shafarevich conjecture for an arbitrary curve X of genus
$g \geq 1$ over a field K, whose bad reduction points belong to S, one associates to X
its Jacobian variety $A = J_X$, endowed with a canonical principal polarization
$\theta$, defined over the same field K. It is known that A has good reduction outside
S, and X is determined by the pair $(A, \theta)$ due to the Theorem of Torelli, see
[Wei57]. Let us prove that the number of K-isomorphism classes of curves
X having the same pair $(A, \theta)$ is also finite over the base field K.
For this one fixes a natural number $m \geq 3$, and one considers the extension
$K(A_m)/K$, obtained by adjoining to K the coordinates of all points of order m
on A. The extension $K(A_m)/K$ is then unramified outside $S \cup \{\text{divisors of } m\}$,
and all extensions of the form $K(A_m)/K$ have bounded degree; they are all
contained in a finite extension K' by the Theorem of Hermite (cf. §4.1.5).

Let us prove that the set of K'-isomorphism classes of such curves is finite.
If $\sigma \in \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$, and $\varphi$ is an isomorphism of curves with the same Jacobian,
preserving the polarization, then $\alpha = \varphi^\sigma \circ \varphi^{-1}$ induces the identity on $A_m$. It
is well-known that then the morphism $\alpha$ is the identity (see, for example, [La62]):
the matrix $T_l(\alpha) \in \operatorname{Aut} T_l(A) \cong \mathrm{GSp}_g(\mathbb{Z}_l)$ with coefficients in Q is a unitary
matrix with respect to the involution $\alpha \mapsto \alpha^\rho$ determined by the polarization
(the Rosati involution), i.e. $\alpha\alpha^\rho = 1$. The characteristic roots $\omega_i$ of the matrix
$T_l(\alpha)$ are then algebraic integers whose absolute value is equal to one for
all Archimedean valuations, i.e. the $\omega_i$ are roots of unity. By the assumption,
$\alpha - 1 = m\beta$ for some $\beta \in \operatorname{End} A$, that is, $\omega_i - 1 = m\beta_i$, where the $\beta_i$ are algebraic
integers, implying $\omega_i = 1$.
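The final implication can be spelled out by a norm computation. The following is a sketch of one standard argument, stated under the assumption that $\omega \neq 1$ is a root of unity of exact order $n > 1$ with $\omega - 1 = m\beta$, $\beta$ an algebraic integer and $m \geq 3$; it is not claimed to be the argument of [La62].

```latex
% Norm from Q(\omega) to Q of \omega - 1, in terms of the cyclotomic polynomial \Phi_n:
\[
  \bigl| N_{\mathbb{Q}(\omega)/\mathbb{Q}}(\omega - 1) \bigr|
  = \Phi_n(1)
  = \begin{cases}
      p, & n = p^k \text{ a prime power},\\
      1, & \text{otherwise}.
    \end{cases}
\]
% Taking norms in \omega - 1 = m\beta, with \varphi(n) = [\mathbb{Q}(\omega) : \mathbb{Q}]:
\[
  m^{\varphi(n)} \,\bigl| N_{\mathbb{Q}(\omega)/\mathbb{Q}}(\beta) \bigr| = \Phi_n(1)
  \quad\Longrightarrow\quad
  m^{\varphi(n)} \ \text{divides}\ \Phi_n(1) \in \{1, p\}.
\]
% For m >= 3 this is impossible: m^{\varphi(n)} = p would force \varphi(n) = 1, n = 2, p = 2, m = 2.
```

The excluded case m = 2, n = 2 is realized by $\omega = -1$, which acts trivially on the points of order 2; this is why the level m is taken to be at least 3.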

In order to pass from K' to K, one can use the fact that the number
of K-forms of a curve X, isomorphic to X over K', is finite. In fact,
such forms are classified up to K-isomorphism by the finite cohomology set
$H^1(G(K'/K), \operatorname{Aut}(X))$, see [Se64].

We have reduced the Shafarevich conjecture for curves to an analogous
statement for Abelian varieties, more precisely to the finiteness of the
set $Ш^{(1)}_{Av}(g, K, S)$ of the K-isomorphism classes of pairs $(A, \theta)$, where A is a
g-dimensional Abelian variety over K with good reduction away from S, and
$\theta$ is a polarization of degree 1, defined over K. As we have already seen in
section 5.3.5, the $\mathbb{C}$-isomorphism classes of pairs $(A, \theta)$ correspond to points
on the Siegel modular variety $\mathcal{A}_g(\mathbb{C}) = \mathbb{H}_g/\mathrm{Sp}_g(\mathbb{Z})$, which is a quasiprojective
normal variety, and can be defined over $\mathbb{Q}$.

Another key idea, proposed by A.N.Parshin for solving Mordell's problem,
was to associate to elements of $Ш^{(1)}_{Av}(g, K, S)$ certain points of the set $\mathcal{A}_g(K)$,
and then to prove that all such points have bounded height in some projective
embedding of the variety $\mathcal{A}_g$. Note that under this correspondence the map
$Ш^{(1)}_{Av}(g, K, S) \to \mathcal{A}_g(K)$ is not injective, since $\mathbb{C}$-isomorphic pairs $(A, \theta)$ need
not be K-isomorphic. However, it is easy to check that the above map has
finite fibers: for the field K' considered above, the corresponding map

$$Ш^{(1)}_{Av}(g, K', S') \to \mathcal{A}_g(K')$$

is already injective, and an analogous argument shows that the fibers of the
map

$$Ш^{(1)}_{Av}(g, K, S) \to Ш^{(1)}_{Av}(g, K', S') \tag{5.5.4}$$

are also finite (the theorem of finiteness for forms of an Abelian variety).

Consideration of Abelian varieties $A = J_X$ rather than curves X was a
very fruitful idea: to each Abelian variety A one can attach its Tate module
$T_l(A)$ (l is a prime), regarded as a module over $G_K$. If $\lambda : A \to B$ is an isogeny
over K, then one checks that the corresponding mapping

$$T_l(\lambda) : T_l(A) \to T_l(B)$$

is an isomorphism of $G_K$-modules, so A and B have the same sets of bad
reduction places. Therefore, the finiteness of $Ш^{(1)}_{Av}(g, K, S)$ follows from:


I) The finiteness of the number of isomorphism classes of $G_K$-modules
$M \cong \mathbb{Z}_l^{2g}$ which arise as Tate modules $T_l(A)$ with given g, K, S.

II) The finiteness of the K-isomorphism classes of pairs $(A, \theta)$, for which
the $G_K$-module $T_l(A)$ is isomorphic to a given module M.
Recall the property observed first by Faltings that the character of a
(continuous) representation $\rho : G_K \to \operatorname{Aut} V_l(A)$ is determined by its values on a
finite set of elements $Fr_v$ ($v \in Q$), $\operatorname{Card}(Q) < \infty$, $Q \cap S = \emptyset$, which depends
only on the given g, K, S (cf. §5.4.3).

By a theorem of A.Weil for Abelian varieties over finite fields (cf. §5.4.1),
which generalizes the theorem of Hasse for elliptic curves (cf. §5.1.3), the
number $|\operatorname{Tr}(\rho_l(Fr_v))|$ is an integer not exceeding $2g\sqrt{\mathrm{N}v}$. We conclude that for
every prime l there are finitely many possibilities for characters of
representations $\rho_l$.
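The counting behind this conclusion is elementary: at each place v of the finite set Q the trace is an integer in a bounded interval. A minimal sketch (function names are illustrative, and the list of norms in the demonstration is hypothetical):

```python
from math import isqrt


def admissible_traces(g, norm_v):
    """Integers t with |t| <= 2*g*sqrt(norm_v), i.e. t^2 <= 4*g^2*norm_v."""
    top = isqrt(4 * g * g * norm_v)
    return range(-top, top + 1)


def bound_on_characters(g, norms):
    """Upper bound for the number of trace tuples (Tr rho(Fr_v)), v in Q,
    given the norms Nv of the places v in Q."""
    count = 1
    for norm_v in norms:
        count *= len(admissible_traces(g, norm_v))
    return count


if __name__ == "__main__":
    print(list(admissible_traces(1, 5)))
    print(bound_on_characters(2, [3, 5, 7]))  # hypothetical norms
```

Since a character is determined by its values on the Frobenius classes for v in Q, `bound_on_characters` bounds the number of characters that can occur.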


Thus statement I) reduces to proving the semisimplicity of the $G_K$-module
$V_l(A)$: for any $\mathbb{Q}_l$-subspace W in $V_l(A)$ which is a $G_K$-submodule, there exists
an endomorphism $u \in \operatorname{End} A \otimes \mathbb{Q}_l$ such that $u^2 = u$ and $uV_l(A) = W$, so that
$(1 - u)V_l(A)$ is a $G_K$-invariant complement of W in $V_l(A)$.

In turn the proof of statement II) also splits into the following steps:


1) Let us consider the set $Ш_{Av}(g, K, S)$ of K-isomorphism classes of Abelian
   varieties as above (but without polarization) with given g, K, S. Then all
   the fibers of the mapping

   $$Ш^{(1)}_{Av}(g, K, S) \to Ш_{Av}(g, K, S) \tag{5.5.5}$$

   are finite.

2) Tate's conjecture on isogenies. For an arbitrary homomorphism $\lambda : A \to B$
   of Abelian varieties over K, consider the corresponding mapping
   $V_l(\lambda) : V_l(A) \to V_l(B)$. Then
   a) if the $G_K$-modules $V_l(A)$ and $V_l(B)$ are isomorphic then the varieties A
      and B are isogenous over K;
   b) the natural mappings

      $$\operatorname{End} A \otimes \mathbb{Z}_l \to \operatorname{End} T_l(A) \quad\text{and}\quad \operatorname{End} A \otimes \mathbb{Q}_l \to \operatorname{End} V_l(A) \tag{5.5.6}$$

      are bijective.

3) The finiteness theorem for isogenies. The set of K-isomorphism classes of
   Abelian varieties B over K, for which there exists an isogeny $A \to B$, is
   finite.


Statement 1) is not difficult and it reduces to showing that the number of
polarizations of degree 1, defined over K (up to a K-isomorphism of polarized
varieties) is finite. This is deduced as follows: let us view a principal
polarization $\theta$ as an isomorphism $\theta : A \to A^\vee = \operatorname{Pic}^0(A)$. An isomorphism $\lambda : A \to A$,
compatible with polarizations $\theta_1$ and $\theta_2$ of A, gives rise to a commutative
diagram

             lambda
        A ----------> A
        |             |
    theta_1       theta_2
        v             v
      A^vee <------ A^vee
           lambda^vee

that is,

$$\theta_1 = \lambda^\vee \circ \theta_2 \circ \lambda. \tag{5.5.7}$$

Let us fix an isomorphism $\theta_0 : A \cong A^\vee$, and hence $\operatorname{End} A \cong \operatorname{End} A^\vee$. Then the
mapping $\lambda \mapsto \lambda^\vee$ becomes the Rosati involution $\rho$, and all the automorphisms of
the form $\theta_0^{-1}\theta_i$ will then be invariant under $\rho$: $(\theta_0^{-1}\theta_i)^\rho = \theta_0^{-1}\theta_i$. Moreover, the
following equality holds: $\theta_0^{-1}(\lambda^\vee \circ \theta_i \circ \lambda) = \lambda^\rho \circ \theta_0^{-1}\theta_i \circ \lambda$. Hence the
equality (5.5.7) takes the form

$$\theta_0^{-1}\theta_1 = \lambda^\rho \circ (\theta_0^{-1}\theta_2) \circ \lambda. \tag{5.5.8}$$

This equality shows that our statement is analogous to the assertion on the
finiteness of the number of classes of integral unimodular quadratic forms up
to integral equivalence. More precisely, this fact can be stated as follows:
if E is an order in a semisimple algebra $E \otimes \mathbb{Q}$ with an involution $\rho$, then the
group $E^\times$, acting by the formula $(x, h) \mapsto x^\rho h x$ ($x \in E^\times$) on the set of all
Hermitian elements ($h^\rho = h$) of E with a fixed norm, has only a finite number
of orbits. In our case we use $E = \operatorname{End} A$, and we use the semisimplicity of the
algebra $\operatorname{End} A \otimes \mathbb{Q}$.

Properties a) and b) of 2) are then reduced to the semisimplicity property
of the module $V_l(A)$ applied to the varieties $A^2$ and $A \times B$.


5.5.5 Reduction of the conjectures of Tate to the finiteness 
properties for isogenies 


A fruitful approach to the proof of the theorem on semisimplicity and of the
conjectures of Tate was developed by Yu.G. Zarhin [Zar74] in the years 1974-77.
He showed that for any ground field these properties could be deduced
from the property 3) (which is sometimes known as Conjecture T) using a
"unitary trick" for Abelian varieties discovered by himself. Using this trick
Zarhin proved the Tate conjecture over global fields of positive characteristic.
Note that over finite fields these conjectures were proved by Tate himself
[Ta66].
It suffices to show that there is an isomorphism

$$\operatorname{End}_K(A) \otimes \mathbb{Q}_l \xrightarrow{\ \sim\ } \operatorname{End}_{G_K}(T_l(A) \otimes \mathbb{Q}_l)$$

(induced by a natural map from the left to the right). Consider a
non-degenerate skew-symmetric pairing (see also section 5.3.5):

$$e^D : T_l(A) \times T_l(A) \to \mathbb{Z}_l(1) = \varprojlim_m \mu_{l^m},$$
attached to an ample divisor D over A. We choose a maximal isotropic
$G_K$-submodule W in $T_l(A) \otimes \mathbb{Q}_l$, and let $W_m$ be the image of $T_l(A) \cap W$ in the
quotient module

$$T_l(A)/l^m T_l(A) \cong A_{l^m} = \operatorname{Ker}(A \xrightarrow{\ l^m\ } A).$$
There is a commutative diagram 


A—2> A/Wm = Am) 


NE 


A 


It follows from Conjecture T that infinitely many of the varieties $A_{(m)}$ should
be K-isomorphic. Let us denote by

$$\nu_m : A_{(m)} \to A_{(m_0)}$$

a fixed isomorphism. Now consider
Um = pie 0 Vm Am, € End« (A) ®Q; 


and define 
u= lim um € Endx(A) ®Q. 


It is easy to check that we can recover the Gx-submodule W as the image of 
Ti(u): 
Ti(u)(Ti(A) ® Qi) = W. 


We claim that for an arbitrary Gx-submodule in Endg,,(Ti(A) ® Q;) there 
exists an idempotent u € Endx(A) ® Q;, u? = u, such that 


u(Ti(A) ®@ Qu) = W. (5.5.9) 


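This can be established by considering the variety $A^8$, and by constructing
a certain maximal isotropic $G_K$-submodule $W^{(8)} \in \operatorname{End}_{G_K}(T_l(A^8) \otimes \mathbb{Q}_l)$
attached to W. Then we apply to $W^{(8)}$ the assertion already proved.

Consider the coordinate projections $p_i : A^8 \to A$ ($i = 1, 2, \dots, 8$), and let
$D^{(4)} = \sum_{i=1}^{4} p_i^* D$, $D^{(8)} = \sum_{i=1}^{8} p_i^* D$ be the divisors on $A^4$ and $A^8$. We
choose $a, b, c, d \in \mathbb{Q}_l$ satisfying $a^2 + b^2 + c^2 + d^2 = -1$, and define

$$I = \begin{pmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{pmatrix}.$$

It is easy to check that ${}^tI \cdot I = -1_4$, where $1_4$ is the identity matrix. Consider
I as an element in $\operatorname{End}_{G_K}(T_l(A^4) \otimes \mathbb{Q}_l)$ and put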
$$W_1 = \{(x, Ix) \mid x \in W^4\}, \qquad W_2 = \{(x, -Ix) \mid x \in (W^4)^\perp\},$$

where $(W^4)^\perp$ is defined by the skew-symmetric form $e_4$ attached to the divisor
$D^{(4)}$. Then we have that $W_1 \cap W_2 = \{0\}$, and $W_1$, $W_2$ are orthogonal with respect
to the pairing $e_8$ associated to the divisor $D^{(8)}$. The $G_K$-submodule

$$W^{(8)} = W_1 + W_2 \subset \operatorname{End}_{G_K}(T_l(A^8) \otimes \mathbb{Q}_l)$$

is a maximal isotropic submodule with respect to the pairing defined by $e_8$
which satisfies all desired properties. These arguments show that there exist
elements $u_1, \dots, u_8 \in \operatorname{End}_K(A) \otimes \mathbb{Q}_l$ such that

$$\sum_{i=1}^{8} u_i(T_l(A) \otimes \mathbb{Q}_l) = W_1 + W_2.$$

The right ideal in $\operatorname{End}_K(A) \otimes \mathbb{Q}_l$ generated by $u_1, \dots, u_8$ can be generated
by a single idempotent element u because this algebra is known to be
semisimple (see §5.3.5). This element exactly satisfies our requirement (5.5.9).
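The identity behind the "unitary trick", namely that the transpose of the 4 x 4 matrix I times I equals (a^2 + b^2 + c^2 + d^2) times the identity, is a finite computation. The sketch below assumes the quaternion-type layout of I with first column (a, b, c, d); since Python has no l-adic numbers, it checks the identity over the integers and then modulo a prime l, where a solution of a^2 + b^2 + c^2 + d^2 = -1 is found by brute force. Lifting such a solution to Q_l (for odd l, by Hensel's lemma) is assumed and not carried out here.

```python
def quaternion_matrix(a, b, c, d):
    """4x4 matrix I with transpose(I) * I = (a^2 + b^2 + c^2 + d^2) * identity."""
    return [
        [a, -b, -c, -d],
        [b,  a,  d, -c],
        [c, -d,  a,  b],
        [d,  c, -b,  a],
    ]


def transpose_times(m, modulus=None):
    """Return transpose(m) * m, optionally reduced modulo `modulus`."""
    n = len(m)
    prod = [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
    if modulus is not None:
        prod = [[entry % modulus for entry in row] for row in prod]
    return prod


def four_squares_minus_one(l):
    """Brute-force (a, b, c, d) mod l with a^2 + b^2 + c^2 + d^2 = -1 (mod l)."""
    for a in range(l):
        for b in range(l):
            for c in range(l):
                for d in range(l):
                    if (a * a + b * b + c * c + d * d + 1) % l == 0:
                        return a, b, c, d
    return None


if __name__ == "__main__":
    l = 7
    abcd = four_squares_minus_one(l)
    print(abcd, transpose_times(quaternion_matrix(*abcd), l))
```

Modulo l the product has l - 1, that is -1, on the diagonal and zeros elsewhere, which is the mod l shadow of the relation used to make W_1 isotropic.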


5.5.6 The Faltings—Arakelov Height 


In the previous section we reduced the Mordell conjecture to Conjecture T
on the finiteness of the number of K-isomorphism classes of Abelian varieties
which are K-isogenous to a given variety A. The proof of Conjecture T uses a
certain canonical height h(A) of A over K introduced by Faltings using ideas
of Arakelov [Ara74a]. Its principal properties are:


Finiteness Principle. For given g, K and a real number b the number of
K-isomorphism classes of Abelian varieties A over K with the condition $h(A) \leq b$
is finite.


Boundedness under isogenies: there exists a constant c such that for all
K-isogenous Abelian varieties A and B one has $|h(A) - h(B)| < c$.


In order to define the height h(A) consider first a one-dimensional vector
space L over K endowed for all places v of K with a v-adic norm $\|\cdot\|_v : L \to \mathbb{R}$,
satisfying the condition $\|\lambda s\|_v = |\lambda|_v \|s\|_v$, where $|\lambda|_v$ is the normalized
v-valuation of an element $\lambda \in K^\times$. Suppose that for $s \in L \setminus \{0\}$ the equality
$\|s\|_v = 1$ holds for almost all v (i.e. with the possible exclusion of a finite number
of them). In view of the product formula (4.3.31) we have $\prod_v |\lambda|_v = 1$ for
$\lambda \in K^\times$, therefore the product $\prod_v \|s\|_v$ is independent of the choice of s.
The degree of L is defined by the formula:

$$\deg L = -\log \prod_v \|s\|_v. \tag{5.5.10}$$
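The independence of deg L from the choice of s rests entirely on the product formula, which for K = Q can be checked with exact arithmetic. The sketch below assumes K = Q with the usual normalized absolute values; the function names are illustrative.

```python
from fractions import Fraction


def prime_divisors(n):
    """Prime divisors of a nonzero integer n, by trial division."""
    n = abs(n)
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def p_adic_abs(x, p):
    """Normalized p-adic absolute value |x|_p = p^(-ord_p(x)) of a nonzero rational."""
    if x == 0:
        raise ValueError("x must be nonzero")
    num, den, order = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(p) ** (-order)


def product_over_all_places(x):
    """Product of |x|_v over the Archimedean place and all primes p."""
    result = abs(x)
    for p in prime_divisors(x.numerator * x.denominator):
        result *= p_adic_abs(x, p)
    return result


if __name__ == "__main__":
    print(product_over_all_places(Fraction(-45, 28)))
```

Replacing s by a multiple of s by a nonzero rational number multiplies the product of the norms by `product_over_all_places` of that number, which is 1, so the degree in (5.5.10) does not change.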
Let $O_K \subset K$ be the maximal order. Defining norms $\|\cdot\|_v$ for all finite v is
equivalent to defining an integral structure, or an $O_K$-form $L_{O_K}$ of L, that
is, to defining a projective $O_K$-module of rank 1 such that $L_{O_K} \otimes_{O_K} K = L$.
In order to define $L_{O_K}$ we put

$$L_{O_K} = \{s \in L \mid \|s\|_v \leq 1 \text{ for all finite places } v \text{ of } K\}.$$

Conversely, for a given $O_K$-module of rank 1, $L_{O_K} \subset L$, with the property
$L_{O_K} \otimes_{O_K} K = L$ we define the norm $\|\cdot\|_v$ using the isomorphism of vector
spaces $L_{O_K} \otimes_{O_K} K_v \cong K_v$ which takes $L_{O_K} \otimes_{O_K} O_v$ to $O_v$ (for a finite v).
If $s \in L_{O_K} \setminus \{0\}$ then $O_K \cdot s$ is a submodule of $L_{O_K}$ and

$$\operatorname{Card}(L_{O_K}/O_K \cdot s) = \prod_{v \nmid \infty} \|s\|_v^{-1}.$$

Consideration of Archimedean metrics $\|s\|_v$ is a convenient replacement of
the notion of an integral structure. Defining this metric is equivalent to giving
a Hermitian form $(\cdot,\cdot)_\sigma$ on the one-dimensional complex vector space
$L_\sigma = L \otimes_{K,\sigma} \mathbb{C}$ for all embeddings $\sigma : K \to \mathbb{C}$ associated with Archimedean places
v. We have that

$$\|s\|_v = (s, s)_\sigma^{1/2} \ \text{ if } K_v \cong \mathbb{R}; \qquad \|s\|_v = (s, s)_\sigma \ \text{ if } K_v \cong \mathbb{C}. \tag{5.5.11}$$


For an Abelian variety $A$ over $K$, we let $\omega(A) = \Omega^g_K[A]$ denote
the one-dimensional $K$-vector space of regular (algebraic) differential forms
of maximal degree $g$ on $A$ (where $g = \dim A$). For a number field $K$ there
is a natural $v$-adic norm $\|\cdot\|_v$ on $\omega(A)$ defined as follows.


a) For non-Archimedean places $v$ the norm $\|\cdot\|_v$ is defined using the
theory of Néron, which makes it possible to define a minimal model
$A_{\mathcal{O}_v}$ of $A$ over $\mathcal{O}_v$ and a one-dimensional
$\mathcal{O}_v$-module $\omega(A_{\mathcal{O}_v})$ endowed with a canonical
isomorphism

$$\omega(A_{\mathcal{O}_v}) \otimes_{\mathcal{O}_v} K_v \cong \omega(A) \otimes_K K_v. \tag{5.5.12}$$

The norms are those corresponding to the $\mathcal{O}_K$-module
$\omega(A)_{\mathcal{O}_K}$.
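b) For an Archimedean place $v$ given by an embedding $\sigma : K \to \mathbb{C}$
the norm $\|\cdot\|_v$ is defined using the Hermitian form

$$(\alpha, \beta)_\sigma = \frac{1}{(2\pi)^g} \int_{A(\mathbb{C})} \alpha \wedge \overline{\beta}, \tag{5.5.13}$$

on $\omega(A)_\sigma = \omega(A) \otimes_{K,\sigma} \mathbb{C}$, where
$\alpha \wedge \overline{\beta}$ is a $2g$-dimensional differential form, which
is integrated over the (topologically) $2g$-dimensional variety $A(\mathbb{C})$.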
In terms of these norms on $\omega(A)$ the height $h(A)$ of the variety $A$ is
defined as follows (see the equality (5.5.10)):

$$h(A) = \frac{1}{[K : \mathbb{Q}]} \deg \omega(A). \tag{5.5.14}$$

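For example, if $K = \mathbb{Q}$ and $\alpha$ is one of the two generators
$\pm\alpha$ of the $\mathbb{Z}$-module $\omega(A)_{\mathbb{Z}}$ (the Néron
differential) then

$$h(A) = -\frac{1}{2} \log \left( \frac{1}{(2\pi)^g} \left| \int_{A(\mathbb{C})} \alpha \wedge \overline{\alpha} \right| \right).$$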


The proof of the finiteness principle for the height $h(A)$ may be broken up
into the following steps:


1) Reduction to a finiteness statement for Abelian varieties with a principal
   polarization. This can again be achieved using the "unitary trick" of
   Yu.G.Zarhin, who proved that for an Abelian variety $A$ over $K$ and the
   dual variety $A^\vee$ there always exists a principal polarization on the
   variety $A^4 \times (A^\vee)^4$ (see [Zar85]).

2) The study of the moduli spaces $\mathcal{A}_g/\mathbb{Q}$ of isomorphism
   classes of pairs $(A, \theta)$ where $\theta$ is a principal polarization of
   $A$. Recall that $\mathcal{A}_1$ is an affine line defined over $\mathbb{Q}$
   parametrizing elliptic curves by means of the elliptic modular invariant
   (5.3.16). In general, $\mathcal{A}_g$ is a normal algebraic variety of
   dimension $g(g+1)/2$, which is not compact for $g \ge 1$, and the structure
   of its various compactifications is rather complicated [Fal85]. A pair
   $(A, \theta)$ defined over $K$ produces a point
   $J(A, \theta) \in \mathcal{A}_g(K)$, and one can define heights of such
   points using various projective embeddings of the variety $\mathcal{A}_g$
   defined over $\mathbb{Q}$.

3) A canonical projective embedding of $\mathcal{A}_g$ is constructed using
   Siegel modular forms; these forms may be viewed as global sections of
   certain line bundles (more precisely, powers of the canonical line bundle)
   on $\mathcal{A}_g$, cf. [MZ72], [PZ88], [FW84]. The corresponding height of
   a point $J(A, \theta)$ is called the modular height. A key observation of
   Faltings was that the height (5.5.14) and the modular height are essentially
   equal. Thus the finiteness principle follows from the basic property of
   heights of points in a projective space: there are only finitely many
   isomorphism classes of pairs $(A, \theta)$ over $K$ with bounded modular
   height of the corresponding points $J(A, \theta)$ (compare with §5.2.5).


The above three steps give only a hint of the strategy of the proof of the
finiteness principle; carrying out this program in detail is quite a technical
task, see also the review of B. Mazur [Maz77].


5.5.7 Heights under isogenies and Conjecture T 


By the finiteness principle of §5.5.6, Conjecture T and the Mordell conjecture
follow from the boundedness of $|h(A) - h(B)|$ for $K$-isogenous varieties $A$
and $B$. The proof of this fact is deduced from Theorems I and II given below.


Theorem I. Let $p$ be a prime number unramified in $K$. There exists a finite
set $M = M(K, p, g)$ of primes such that: if $A$ is an Abelian variety of
dimension $g$ over $K$ with good reduction at all places of $K$ dividing $p$,
and $S(A)$ is the set of all primes divisible by the places of bad reduction of
$A$, then for each isogeny $A \to B$ whose degree is not divisible by any prime
in $M$ we have $h(A) = h(B)$.

In particular, for an Abelian variety $A$ over $K$ there exists a finite set
$M$ of primes such that there is only a finite set of Abelian varieties $B$
over $K$ (up to $K$-isomorphism) which admit a $K$-isogeny $A \to B$ whose
degree is not divisible by any prime in $M$.


Theorem II. Let $A$ be an Abelian variety over $K$ and let $l$ be a fixed prime
number. Then the set of Abelian varieties $B$ (up to $K$-isomorphism) which
admit a $K$-isogeny $A \to B$ of degree $l^m$ ($m \ge 1$) is finite.


The proofs of both theorems are based on explicit formulas for the behaviour of
$h(A)$ under isogenies. In the good case (Theorem I) the height does not change
because in this case one can define an isomorphism of the corresponding modules
with metrics $\omega(A)$ and $\omega(B)$, which therefore have the same degree.


The proof of Theorem II proceeds by reductio ad absurdum. Suppose that there
is an infinite sequence of $K$-isogenous Abelian varieties

$$A \to B_{(1)} \to B_{(2)} \to \cdots \to B_{(n)} \to B_{(n+1)} \to \cdots$$

such that the kernels $W_n = \mathrm{Ker}(A \to B_{(n)})$ form an $l$-divisible
group. Using known results on the structure and properties of $l$-divisible
groups, one proves that the sequence $h(B_{(n)})$ stabilizes from a
sufficiently large $n_0$ onwards, and Theorem II follows by applying the
finiteness principle for the height.


Conjecture T follows from Theorems I and II, since every isogeny $A \to B$ can
be decomposed into a composition of isogenies

$$A \to B_{(0)} \to B_{(1)} \to B_{(2)} \to \cdots \to B_{(n)} = B,$$

satisfying the following conditions:


a) the degree of $A \to B_{(0)}$ is not divisible by any prime in the finite
   set $M \cup S(A) = \{l_1, l_2, \dots, l_n\}$ of Theorem I;
b) the degree of $B_{(i-1)} \to B_{(i)}$ is a power of the prime $l_i$.


According to Theorem I there are only finitely many possibilities for the
variety $B_{(0)}$ (up to $K$-isomorphism). Applying Theorem II to the variety
$B_{(0)}$ and to the prime $l_1$ shows that there are only finitely many
possibilities for the variety $B_{(1)}$. By induction, there are only finitely
many possibilities for $B = B_{(n)}$ (cf. [PZ88], pp. 383-384).




This completes the proof of the Mordell conjecture, as well as the conjec- 
ture T, the finiteness conjecture of Shafarevich, and the conjecture of Tate for 
Abelian varieties. 


In §5.5.1 we have mentioned the second conjecture of Shafarevich.

This conjecture can be reformulated as a statement on the non-existence
of certain smooth proper schemes over $\mathrm{Spec}\,\mathbb{Z}$ of relative
dimension 1 and of genus $\ge 1$. Of course, the classical geometric analogue
of this conjecture is well known (and discussed in [Sha62] as a motivation).

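Moreover, in the 1980s J.-M. Fontaine [Fon81] and independently V.A. Abrashkin
(see in [PZ88]) proved that over the maximal orders in $\mathbb{Q}$,
$\mathbb{Q}(\sqrt{-1})$, $\mathbb{Q}(\sqrt{-2})$, $\mathbb{Q}(\sqrt{-3})$,
$\mathbb{Q}(\sqrt{-7})$, $\mathbb{Q}(\sqrt{2})$, $\mathbb{Q}(\sqrt{5})$,
$\mathbb{Q}(\sqrt[5]{1})$ there exist no smooth proper Abelian schemes.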
Zeta Functions and Modular Forms


6.1 Zeta Functions of Arithmetic Schemes 


6.1.1 Zeta Functions of Arithmetic Schemes 


(cf. [Sha69], [Se65]). Let $X$ be a scheme of finite type over
$\mathrm{Spec}\,\mathbb{Z}$ (see §5.1). Then the closed points $x \in X$ are
those which satisfy the condition that the corresponding residue field $R(x)$
is finite. The cardinality of $R(x)$ is called the norm of $x$ and is denoted
by $N(x)$. The set of all closed points of $X$ is denoted by $\overline{X}$.
For the moment we shall think of this as a discrete topological space.
The zeta function of $X$ is defined to be the Euler product

$$\zeta(X, s) = \prod_{x \in \overline{X}} (1 - N(x)^{-s})^{-1}. \tag{6.1.1}$$

In the case $X = \mathrm{Spec}\,\mathbb{Z}$ definition (6.1.1) leads to the
Riemann zeta function $\zeta(s)$ in view of Euler's identity:

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_p (1 - p^{-s})^{-1}. \tag{6.1.2}$$

For an arithmetic scheme there are only finitely many points with a given
norm, so the product (6.1.1) is a formal Dirichlet series
$\sum_{n=1}^{\infty} a_n n^{-s}$ with integral coefficients.


Theorem 6.1. The product (6.1.1) is absolutely convergent for Re(s) > 
dim X, where dim X is the dimension of X (see §5.2.1 of Chapter 5). 


The proof of this fact can be reduced to the following special cases:

(a) $X = \mathrm{Spec}\,\mathbb{Z}[T_1, \dots, T_n]$. The product then takes
the form:

$$\zeta(X, s) = \prod_p (1 - p^{n-s})^{-1} = \zeta(s - n); \tag{6.1.3}$$

(b) $X = \mathrm{Spec}\,\mathbb{F}_p[T_1, \dots, T_n]$. We then have

$$\zeta(X, s) = (1 - p^{n-s})^{-1}. \tag{6.1.4}$$

Equation (6.1.3) is implied by (6.1.4), which can in turn be obtained from the
following calculation of the number of closed points of an arbitrary variety
$X$ over a finite field $\mathbb{F}_q$ ($q = p^r$).

Let $n_k = \mathrm{Card}\{x \in \overline{X} \mid N(x) = q^k\}$ be the number
of closed points with norm $q^k$ and
$\nu_l = \mathrm{Card}\, X(\mathbb{F}_{q^l})$ the number of geometric points
with values in $\mathbb{F}_{q^l}$, i.e. the number of morphisms
$\mathrm{Spec}\,\mathbb{F}_{q^l} \to X$.

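Lemma 6.2. The numbers $\nu_l$ and $n_k$ are finite and are related by the
following formula

$$\nu_l = \sum_{k \mid l} k\, n_k. \tag{6.1.5}$$

This fact is implied by the observation that a closed point
$x \in \overline{X}$ with $N(x) = q^k$ is the image of exactly $k$ morphisms
$\mathrm{Spec}\,\mathbb{F}_{q^l} \to X$ when $k$ divides $l$ (one for each
$\mathbb{F}_q$-embedding $R(x) \to \mathbb{F}_{q^l}$), and of none otherwise.
We now obtain the following identities:

$$\zeta(X, s) = \prod_{k=1}^{\infty} (1 - q^{-ks})^{-n_k}, \tag{6.1.6}$$

$$\log \zeta(X, s) = -\sum_{k=1}^{\infty} n_k \log(1 - q^{-ks})
  = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} n_k \frac{q^{-kms}}{m}
  = \sum_{l=1}^{\infty} \Bigl( \sum_{k \mid l} k\, n_k \Bigr) \frac{q^{-ls}}{l}
  = \sum_{l=1}^{\infty} \nu_l \frac{q^{-ls}}{l}.$$

(Here we replaced the variable $km$ by $l$, taking (6.1.5) into account.)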
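If $X = \mathrm{Spec}\,\mathbb{F}_q[T_1, \dots, T_n]$ then $\nu_l = q^{ln}$
(the number of points of the affine space $\mathbb{A}^n$ over
$\mathbb{F}_{q^l}$). Hence for $q = p$ we have

$$\log \zeta(\mathbb{A}^n_{\mathbb{F}_p}, s)
  = \sum_{l=1}^{\infty} \frac{p^{ln} p^{-ls}}{l}
  = \sum_{l=1}^{\infty} \frac{p^{l(n-s)}}{l}
  = -\log(1 - p^{n-s}),$$

establishing (6.1.4).

In both cases (6.1.3) and (6.1.4) the product in (6.1.1) converges absolutely
for $\mathrm{Re}(s) > \dim X$.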
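Formula (6.1.5) can be checked by hand for the affine line over $\mathbb{F}_p$, where closed points of norm $p^k$ correspond to monic irreducible polynomials of degree $k$ and $\nu_l = p^l$. The sketch below counts irreducible polynomials by brute force (a deliberately naive method chosen for illustration, practical only for small $p$ and $k$); all function names are illustrative.

```python
from itertools import product


def monic_polys(p, deg):
    """All monic polynomials of the given degree over F_p (coefficients low degree first)."""
    return [coeffs + (1,) for coeffs in product(range(p), repeat=deg)]


def poly_mul(a, b, p):
    """Product of two polynomials over F_p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return tuple(out)


def count_irreducible(p, k):
    """n_k for the affine line over F_p: monic irreducible polynomials of degree k."""
    reducible = set()
    for a in range(1, k // 2 + 1):
        for f in monic_polys(p, a):
            for g in monic_polys(p, k - a):
                reducible.add(poly_mul(f, g, p))
    return p ** k - len(reducible)


def nu_from_closed_points(p, l):
    """Right-hand side of (6.1.5): sum of k * n_k over the divisors k of l."""
    return sum(k * count_irreducible(p, k) for k in range(1, l + 1) if l % k == 0)


for l in range(1, 6):
    # Both columns should agree, since the affine line has p^l points over F_{p^l}.
    print(l, nu_from_closed_points(3, l), 3 ** l)
```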

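Similarly we see that (cf. [Sha69])

$$\zeta(\mathbb{P}^n_{\mathbb{F}_p}, s) = \prod_{m=0}^{n} (1 - p^{m-s})^{-1},$$

$$\zeta(\mathbb{P}^n_{\mathbb{Z}}, s) = \prod_p \prod_{m=0}^{n} (1 - p^{m-s})^{-1} = \prod_{m=0}^{n} \zeta(s - m). \tag{6.1.7}$$
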
6.1.2 Analytic Continuation of the Zeta Functions


It is thought that the functions $\zeta(X, s)$ can all be analytically
continued onto the entire $s$-plane $\mathbb{C}$. The validity of this
conjecture has been verified for many varieties. However, in the general case
only the following weaker result is known.


Theorem 6.3. The function $\zeta(X, s)$ has a meromorphic continuation to the
half plane $\mathrm{Re}(s) > \dim X - \frac{1}{2}$.


The singularities of $\zeta(X, s)$ in the strip
$\dim X - \frac{1}{2} < \mathrm{Re}(s) \le \dim X$ are described by the
following theorem:


Theorem 6.4. Let us assume that $X$ is irreducible and let $R(X)$ be the
residue field of its generic point. Then

1) If $\mathrm{Char}\, R(X) = 0$ then the only pole of $\zeta(X, s)$ for
   $\mathrm{Re}(s) > \dim X - \frac{1}{2}$ is at the point $s = \dim X$, and
   this pole is simple.

2) If $\mathrm{Char}\, R(X) = p > 0$ and $q$ is the highest power of $p$ such
   that $R(X)$ contains $\mathbb{F}_q$ then the only singularities of the
   function $\zeta(X, s)$ for $\mathrm{Re}(s) > \dim X - \frac{1}{2}$ are
   simple poles at the points

$$s = \dim X + \frac{2\pi i n}{\log q} \quad (n \in \mathbb{Z}). \tag{6.1.8}$$

Corollary 6.5. For each non-empty scheme $X$ the point $s = \dim X$ is a pole
of $\zeta(X, s)$, whose order is equal to the number of irreducible components
of $X$ of dimension $\dim X$.


Corollary 6.6. The domain of absolute convergence of the Dirichlet series
$\zeta(X, s)$ is the right half plane $\mathrm{Re}(s) > \dim X$.


Theorems 6.3 and 6.4 are deeper than Theorem 6.1. Their proof is based
on the analogue of the Riemann Hypothesis for curves $X$ over $\mathbb{F}_q$
established by Weil (cf. [Wei49]), see also [Se65], [Nis54], [LaWe].


6.1.3 Schemes over Finite Fields and Deligne's Theorem


If $X$ is a scheme over $\mathbb{F}_q$ then for all $x \in \overline{X}$ the
field $R(x)$ is a finite extension of $\mathbb{F}_q$. Hence
$N(x) = q^{\deg x}$ for some number $\deg x$, called the degree of $x$. In
studying $\zeta(X, s)$ in this case it is convenient to use the new variable
$t = q^{-s}$. We write

$$\zeta(X, s) = Z(X, q^{-s}), \tag{6.1.9}$$

where $Z(X, t)$ is the power series given by the product

$$Z(X, t) = \prod_{x \in \overline{X}} (1 - t^{\deg x})^{-1}.$$

If $n_k = \mathrm{Card}\{x \in \overline{X} \mid \deg x = k\}$ and
$\nu_l = \mathrm{Card}\, X(\mathbb{F}_{q^l})$ then we have seen that

$$\log Z(X, t) = \sum_{l=1}^{\infty} \nu_l \frac{t^l}{l}, \qquad \nu_l = \sum_{k \mid l} k\, n_k, \tag{6.1.10}$$

hence

$$t \frac{Z'(X, t)}{Z(X, t)} = t \frac{d}{dt} \log Z(X, t) = \sum_{l=1}^{\infty} \nu_l t^l. \tag{6.1.11}$$

Equation (6.1.10) is often taken as the definition of the zeta function, and
one writes

$$Z(X, t) = \exp \Bigl( \sum_{l=1}^{\infty} \mathrm{Card}\, X(\mathbb{F}_{q^l}) \frac{t^l}{l} \Bigr). \tag{6.1.12}$$

A remarkable property of the zeta function $Z(X, t)$ is its rationality in the
variable $t$. This was first established by B.Dwork in [Dw59]. The rationality
statement has a direct arithmetical interpretation: the numbers $\nu_l$, i.e.
the numbers of solutions of a certain system of algebraic equations in finite
fields, must satisfy a recurrence relation of the type:

$$\nu_{l+n} = \sum_{i=0}^{n-1} \tau_i \nu_{l+i}$$

for sufficiently large $l$, where $n$ and the $\tau_i$ are certain constants.
It is easy to check that the rationality of the function $Z(X, t)$ is also
equivalent to the existence of finitely many complex numbers $\alpha_i$,
$\beta_j$ such that

$$\nu_l = \sum_j \beta_j^l - \sum_i \alpha_i^l \tag{6.1.13}$$

for all $l \ge 1$. Indeed, this is obtained from the logarithmic derivative of
the identity:

$$Z(X, t) = \frac{\prod_i (1 - \alpha_i t)}{\prod_j (1 - \beta_j t)},$$

taking into account (6.1.11).

A fundamental role in the study of $Z(X, t)$ is played by the fact that the
number $\nu_k$ can be represented as the number of fixed points of a certain
map $F^k$ acting on the set of geometric points
$X(\overline{\mathbb{F}}_q)$.
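Definition 6.7. The Frobenius morphism $F : X \to X$ of a scheme $X$ over
$\mathbb{F}_q$ is defined on each open affine subscheme
$\mathrm{Spec}\, A \subset X$ using the ring homomorphism $a \mapsto a^q$; on
the topological space $X$ the morphism $F$ acts as the identity map.


The consequent self-map of sets of geometric points (not the same thing!)
is written

$$F : X(\overline{\mathbb{F}}_q) \to X(\overline{\mathbb{F}}_q) \tag{6.1.14}$$

and the set $X(\mathbb{F}_{q^k})$ coincides with the set of fixed points of
$F^k : X(\overline{\mathbb{F}}_q) \to X(\overline{\mathbb{F}}_q)$: a point
$\varphi \in X(\overline{\mathbb{F}}_q)$ is represented by a homomorphism
$\varphi : A \to \overline{\mathbb{F}}_q$, where $\mathrm{Spec}\, A$ is an
open affine subset containing
$\varphi(\mathrm{Spec}\, \overline{\mathbb{F}}_q)$. The point $F^k(\varphi)$
is defined by the homomorphism

$$f \mapsto \varphi(f)^{q^k} \quad (f \in A).$$

We see that the condition $\varphi \in X(\mathbb{F}_{q^k})$ is equivalent to
saying that
$\mathrm{Im}\, \varphi \subset \mathbb{F}_{q^k} \subset \overline{\mathbb{F}}_q$,
that is, $\varphi(f) = \varphi(f)^{q^k}$ for all $f \in A$, because
$\mathbb{F}_{q^k} = \{x \in \overline{\mathbb{F}}_q \mid x = x^{q^k}\}$.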

The rationality of the zeta function was a part of a series of conjectures
stated by Weil in 1949. Dwork's proof of the rationality was a significant step
towards proving these conjectures in general. The final step was made by
Deligne in 1973, who proved the so-called "Riemann Hypothesis" for algebraic
varieties $X/\mathbb{F}_q$, cf. [Del74].

For a smooth projective variety $X$ over $\mathbb{F}_q$ of dimension $d$ the
Weil conjectures can be stated as follows:


W1) Rationality:

$$Z(X, t) = \frac{P_1(X, t) \cdots P_{2d-1}(X, t)}{P_0(X, t) \cdots P_{2d}(X, t)}, \tag{6.1.15}$$

    where $d = \dim X$ and $P_r(X, t) \in \mathbb{C}[t]$ for all
    $r = 0, 1, \dots, 2d$ and $P_r(X, 0) = 1$.
W2) Integrality:

$$P_0(X, t) = 1 - t, \qquad P_{2d}(X, t) = 1 - q^d t, \tag{6.1.16}$$

    and for $r = 1, 2, \dots, 2d$ we have that
    $P_r(X, t) = \prod_i (1 - \omega_{r,i} t)$, where the $\omega_{r,i}$ are
    certain algebraic integers.
W3) The Functional Equation:

$$Z(X, 1/(q^d t)) = \pm q^{d\chi/2} t^{\chi} Z(X, t), \tag{6.1.17}$$

    where $\chi$ is the Euler characteristic of $X$, which can be defined
    purely algebraically as $\chi = (\Delta \cdot \Delta)$ (the
    self-intersection number of the diagonal $\Delta \subset X \times X$).
W4) The Riemann Hypothesis: The absolute values of each of the numbers
    $\omega_{r,i}$ and of their conjugates are equal to $q^{r/2}$.

W5) Degrees of the polynomials $P_r(X, t)$: If $X$ is the reduction of a smooth
    projective variety $Y$ defined over a number field embedded in
    $\mathbb{C}$, then the degree of $P_r(X, t)$ is equal to the $r$-th Betti
    number of the complex variety $Y(\mathbb{C})$.


In the case when $X$ is a smooth projective curve over $\mathbb{F}_q$ these
properties were established by Weil, and in particular we have that

$$Z(X, t) = \frac{L(t)}{(1 - t)(1 - qt)}, \tag{6.1.18}$$

where $L(t) = \prod_{i=1}^{2g} (1 - \omega_i t) \in \mathbb{Z}[t]$ ($g$ is the
genus of the curve $X$), and
$|\omega_1| = \cdots = |\omega_{2g}| = \sqrt{q}$. Using the relations (6.1.11)
and (6.1.13) we have

$$\mathrm{Card}\, X(\mathbb{F}_{q^k}) = q^k + 1 - \sum_{i=1}^{2g} \omega_i^k,$$
$$|\mathrm{Card}\, X(\mathbb{F}_q) - q - 1| \le 2g\sqrt{q}. \tag{6.1.19}$$

An elementary proof of the estimate (6.1.19) was found by S.A. Stepanov
(1974) in [Step74] (cf. [Step84], [Step94], [Bom72]).
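For a curve of genus 1 the formula for $\mathrm{Card}\, X(\mathbb{F}_{q^k})$ says that the point count over $\mathbb{F}_p$ already determines the count over $\mathbb{F}_{p^2}$: with $a = \omega_1 + \omega_2 = p + 1 - \nu_1$ and $\omega_1 \omega_2 = p$ one gets $\nu_2 = p^2 + 1 - (a^2 - 2p)$. The following sketch tests this by brute force on one example. The curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$ and the model of $\mathbb{F}_{p^2}$ as $\mathbb{F}_p[w]/(w^2 - c)$ with $c$ a non-residue are choices made here for illustration only.

```python
def find_nonresidue(p):
    """Smallest quadratic non-residue modulo an odd prime p."""
    squares = {(x * x) % p for x in range(p)}
    return next(c for c in range(2, p) if c not in squares)


def count_points_fp(a, b, p):
    """Projective points of y^2 = x^3 + a*x + b over F_p (affine points plus infinity)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y - (x ** 3 + a * x + b)) % p == 0)
    return affine + 1


def count_points_fp2(a, b, p):
    """The same count over F_{p^2}, with elements stored as pairs u + v*w, w^2 = c."""
    c = find_nonresidue(p)

    def mul(u, v):
        return ((u[0] * v[0] + c * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    elements = [(u, v) for u in range(p) for v in range(p)]
    affine = 0
    for x in elements:
        x3 = mul(mul(x, x), x)
        rhs = ((x3[0] + a * x[0] + b) % p, (x3[1] + a * x[1]) % p)
        affine += sum(1 for y in elements if mul(y, y) == rhs)
    return affine + 1


p, a, b = 5, 1, 1
nu1 = count_points_fp(a, b, p)
trace = p + 1 - nu1                       # omega_1 + omega_2
predicted_nu2 = p * p + 1 - (trace * trace - 2 * p)
print(nu1, count_points_fp2(a, b, p), predicted_nu2)
```

The estimate (6.1.19) with $g = 1$ amounts to $(\nu_1 - p - 1)^2 \le 4p$, which can be read off from the same numbers.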

The proof of the Weil conjectures is based on an idea from the theory of
compact topological varieties. If $F$ is a morphism acting on such a variety
$V$ then for the number $\nu(F)$ of fixed points of $F$ (appropriately defined)
the famous Lefschetz fixed-point formula holds:

$$\nu(F) = \sum_{i=0}^{\dim V} (-1)^i\, \mathrm{Tr}\, F|_{H^i(V)} \tag{6.1.20}$$

(the summands are the traces of the linear operators induced by $F$ on the
cohomology groups $H^i(V)$). In this situation one can define an analogue of
(the logarithmic derivative of) the zeta function,
$\sum_{k=1}^{\infty} \nu(F^k) t^k$. It is not difficult to calculate the sum of
this series. Let $(\alpha_{ij})_{j=1,\dots,b_i}$ be the characteristic roots of
the linear operator $H^i(F) = F|_{H^i(V)}$ on $H^i(V)$ and
$b_i = \dim H^i(V)$ the Betti numbers; then

$$\mathrm{Tr}\, F^k|_{H^i(V)} = \sum_{j=1}^{b_i} \alpha_{ij}^k, \qquad
  \nu(F^k) = \sum_{i=0}^{\dim V} (-1)^i \sum_{j=1}^{b_i} \alpha_{ij}^k,$$

and hence

$$\sum_{k=1}^{\infty} \nu(F^k) t^k
  = \sum_{i=0}^{\dim V} (-1)^i \sum_{j=1}^{b_i} \sum_{k=1}^{\infty} \alpha_{ij}^k t^k
  = \sum_{i=0}^{\dim V} (-1)^i \sum_{j=1}^{b_i} \frac{\alpha_{ij} t}{1 - \alpha_{ij} t}
  = \sum_{i=0}^{\dim V} (-1)^i \sum_{j=1}^{b_i} t \frac{d}{dt} \bigl[ \log (1 - \alpha_{ij} t)^{-1} \bigr]. \tag{6.1.21}$$

The series $Z(t)$ is determined by the conditions $Z(0) = 1$ and
$t \frac{d}{dt} \log Z(t) = \sum_{k=1}^{\infty} \nu(F^k) t^k$, so that

$$Z(t) = \prod_{i=0}^{\dim V} \Bigl( \prod_{j=1}^{b_i} (1 - \alpha_{ij} t) \Bigr)^{(-1)^{i+1}}.$$


We see that in this model situation the $Z$-function is rational and can
be calculated very explicitly. Using this fact Weil proposed conjectures W1)
to W4) and proved these conjectures for curves and Abelian varieties. For
these varieties he introduced an analogue of the group $H^1(X)$ and proved a
Lefschetz fixed-point formula of the same type as in the topological situation.
The analogue of $H^1(X)$ is provided by the Tate module $T_l(J_X)$ of the
Jacobian variety of the curve $X$ (resp. of the given Abelian variety). In the
general case analogues of the (topological) cohomology groups $H^i$ were
constructed by Grothendieck (the étale cohomology groups
$H^i_{\mathrm{et}}(X, \mathbb{Q}_l)$). In order to do this he modified the very
notion of a topological space, which was replaced by a certain category (in the
topological situation objects of this category are open sets, and morphisms are
inclusions). The larger category used by Grothendieck was called the étale
topology.

The use of the groups $H^i_{\mathrm{et}}(X, \mathbb{Q}_l)$ and more generally
cohomology groups of sheaves (in the étale topology) made it possible for
Deligne to prove the conjectures of Weil (cf. [Del74], [Del80b], [Kat76]).


6.1.4 Zeta Functions and Exponential Sums 


([Kat76], [Kat88], [Sha69]). A traditional method for counting solutions of
congruences or systems of congruences is related to exponential sums. Formulae
for the quantities $\nu_l = \mathrm{Card}\, X(\mathbb{F}_{q^l})$ can be
obtained using Dirichlet characters
$\chi : \mathbb{F}_q^\times \to \mathbb{C}^\times$ (i.e. multiplicative
characters of $\mathbb{F}_q^\times$). Let $\varepsilon$ denote the trivial
character, which is the constant function 1 on the whole set $\mathbb{F}_q$. If
$m$ divides $q - 1$ then one has the relation

$$\mathrm{Card}\{x \in \mathbb{F}_q \mid x^m = a\} = \sum_{\chi^m = \varepsilon} \chi(a). \tag{6.1.22}$$

Now consider the Gauss sum of a non-trivial character $\chi$,

$$g(\chi) = \sum_{x \in \mathbb{F}_q} \chi(x) \zeta^{\mathrm{Tr}(x)},$$

where $\mathrm{Tr} : \mathbb{F}_q \to \mathbb{F}_p$ is the trace and $\zeta$ is
a primitive $p$-th root of unity.

In the work of Hasse and Davenport of 1934 an interesting relation was
found between exponential sums over finite fields and zeros of zeta functions
(cf. [DH34]).
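Both the counting relation (6.1.22) and the classical fact $|g(\chi)| = \sqrt{q}$ for non-trivial $\chi$ can be checked numerically. The sketch below does so for a prime field $\mathbb{F}_p$, where the trace is the identity map; characters are indexed through a primitive root, and all function names are illustrative assumptions rather than notation from the text.

```python
import cmath
import math


def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def multiplicative_character(p, idx):
    """The character chi with chi(g^k) = exp(2*pi*i*idx*k/(p-1)), as a dict on F_p^x."""
    g = primitive_root(p)
    return {pow(g, k, p): cmath.exp(2j * cmath.pi * idx * k / (p - 1))
            for k in range(p - 1)}


def gauss_sum(p, idx):
    """g(chi) = sum over x in F_p^x of chi(x) * zeta^x, with zeta = exp(2*pi*i/p)."""
    chi = multiplicative_character(p, idx)
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(chi[x] * zeta ** x for x in range(1, p))


def count_roots_via_characters(p, m, a):
    """Right-hand side of (6.1.22) for a in F_p^x and m dividing p - 1."""
    step = (p - 1) // m  # the characters with chi^m trivial have index step*t
    return sum(multiplicative_character(p, step * t)[a] for t in range(m))


p = 7
print([round(abs(gauss_sum(p, idx)) ** 2, 6) for idx in range(1, p - 1)])
print([round(count_roots_via_characters(p, 3, a).real) for a in range(1, p)])
```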


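Example 6.8 (The zeta function of a hypersurface). Let

$$X : a_0 T_0^m + a_1 T_1^m + \cdots + a_n T_n^m = 0$$

be a hypersurface in $\mathbb{P}^n$ over $\mathbb{F}_q$, where
$a_0, a_1, \dots, a_n \in \mathbb{F}_q^\times$ and $q \equiv 1 \bmod m$.
Then we have that

$$Z(X, t) = \frac{P(t)^{(-1)^n}}{(1 - t)(1 - qt) \cdots (1 - q^{n-1} t)}, \tag{6.1.23}$$

where $P(t)$ denotes the polynomial

$$P(t) = \prod_{\chi_0, \chi_1, \dots, \chi_n} \Bigl( 1 - (-1)^{n+1} \chi_0(a_0^{-1}) \cdots \chi_n(a_n^{-1}) \frac{g(\chi_0) \cdots g(\chi_n)}{q}\, t \Bigr)$$

and $\chi_0, \chi_1, \dots, \chi_n$ run through all possible $(n+1)$-tuples of
Dirichlet characters satisfying the conditions

$$\chi_i \ne \varepsilon, \qquad \chi_i^m = \varepsilon, \qquad \chi_0 \chi_1 \cdots \chi_n = \varepsilon.$$

The proof of formula (6.1.23) is based on counting the quantities
$\nu_l = \mathrm{Card}\, X(\mathbb{F}_{q^l})$ using relations between Jacobi
sums and Gauss sums, and the Davenport-Hasse relations: for a non-trivial
character $\chi$ of $\mathbb{F}_q$ let us define the character
$\chi' = \chi \circ N$ of the field $\mathbb{F}_{q^l}$, where
$N : \mathbb{F}_{q^l} \to \mathbb{F}_q$ is the norm map. Then the following
relation holds:

$$-g(\chi') = (-g(\chi))^l. \tag{6.1.24}$$

Relation (6.1.24) makes it possible to find explicitly the numbers $\alpha_i$
and $\beta_j$ such that for $l \ge 1$ one has:

$$\mathrm{Card}\, X(\mathbb{F}_{q^l}) = \sum_j \beta_j^l - \sum_i \alpha_i^l,$$

and (6.1.23) is then implied by (6.1.13).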
Classically, estimates for exponential sums were used to obtain estimates
for the number of geometric points of varieties over finite fields. Conversely,
one can effectively use the estimate W4) of the Weil conjectures to study
exponential sums of a rather general type. We give only a simple example
([Sha69], p. 87).

Let $f(T) \in \mathbb{F}_p[T]$, $0 < \deg f = m < p$ and $\zeta^p = 1$,
$\zeta \ne 1$. Then the following estimate holds:

$$\Bigl| \sum_{x=0}^{p-1} \zeta^{f(x)} \Bigr| \le m \sqrt{p}. \tag{6.1.25}$$
«=0 
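In order to prove (6.1.25) let us consider the auxiliary curve
$y^p - y = f(x)$. Let us denote by $X$ the curve obtained by a
desingularization of its projective closure. Consider the projection
$\pi : X \to \mathbb{P}^1$ given by $(x, y) \mapsto x$. Then

$$\prod_{\xi \in \overline{X}} (1 - N(\xi)^{-s})^{-1}
  = \prod_{x \in \overline{\mathbb{P}^1}} \prod_{\pi(\xi) = x} (1 - N(\xi)^{-s})^{-1}. \tag{6.1.26}$$

The equality $\pi(\xi) = \infty$ is satisfied for a single point $\xi \in X$,
and the corresponding factor in the product (6.1.26) is equal to
$(1 - p^{-s})^{-1}$. If $x_0 \ne \infty$ and the equation
$y^p - y = f(x_0)$ is solvable in the field $\mathbb{F}_p(x_0)$, say with
$\xi = (x_0, y_0)$, then there are exactly $p$ solutions
$y_0, y_0 + 1, \dots, y_0 + p - 1$, each giving a point with norm $N(x_0)$. In
this case each of the corresponding factors in the inner product is equal to
$(1 - N(x_0)^{-s})^{-1}$. If the equation is not solvable in
$\mathbb{F}_p(x_0)$, there is a single point above $x_0$, with norm
$N(x_0)^p$. The solvability of the equation $y^p - y = a$ in
$\mathbb{F}_p(a)$ is equivalent to the condition

$$\mathrm{Tr}_{\mathbb{F}_p(a)/\mathbb{F}_p}(a) = 0, \quad \text{i.e.} \quad \sum_{i=0}^{\deg a - 1} a^{p^i} = 0,$$

or, for $a = f(x_0)$ and the field $\mathbb{F}_p(x_0)$,

$$\sum_{i=0}^{\deg x_0 - 1} f(x_0)^{p^i} = \sum_{i=0}^{\deg x_0 - 1} f(x_0^{p^i}) = \sum_{P(x) = 0} f(x) = 0,$$

where in the last sum $x$ runs through all of the roots of the irreducible
polynomial $P$ associated with a closed point
$\pi(\xi) \in \mathbb{P}^1 \setminus \infty$. Hence the inner product in
(6.1.26) can be transformed into the following

$$\prod_{r=0}^{p-1} (1 - \chi(P)^r N(P)^{-s})^{-1},$$

where

$$\chi(P) = \zeta^{\lambda(P)}, \qquad \lambda(P) = \sum_{P(x) = 0} f(x), \qquad N(P) = p^{\deg P}.$$

Putting $t = p^{-s}$ we see that

$$\zeta(X, s) = (1 - t)^{-1} \prod_P \prod_{r=0}^{p-1} (1 - \chi(P)^r N(P)^{-s})^{-1},$$

where $P$ runs through all irreducible monic polynomials in $\mathbb{F}_p[T]$.
Extracting the factors with $r = 0$ we obtain:

$$Z(X, t) = \frac{1}{(1 - t)(1 - pt)} \prod_{r=1}^{p-1} \prod_P (1 - \chi(P)^r N(P)^{-s})^{-1}.$$

For each monic polynomial $G$ we put

$$\lambda(G) = \sum_{G(x) = 0} f(x), \qquad \chi(G) = \zeta^{\lambda(G)}.$$

The function $\chi$ is multiplicative, so we obtain the equation

$$L_r(t) = \prod_P (1 - \chi(P)^r t^{\deg P})^{-1} = \sum_G \chi(G)^r t^{\deg G}.$$

One verifies easily that $L_r(t)$ is a polynomial and
$\deg L_r(t) < \deg f$, and the coefficient of $t$ is equal to

$$\sum_{a \in \mathbb{F}_p} \chi(T - a)^r = \sum_{a \in \mathbb{F}_p} \zeta^{r f(a)}.$$

Each of these sums is therefore, up to sign, equal to a sum of some (inverse)
roots of the function $Z(X, t)$. The number of these roots is equal to
$\deg L_r(t) < \deg f$, and the absolute value of each root is less than or
equal to $\sqrt{p}$, so that estimate (6.1.25) follows.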
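The estimate (6.1.25) is easy to test numerically for small primes. The following sketch evaluates the exponential sum directly for a few polynomials; the sample polynomials and the prime $p = 13$ are arbitrary choices made here for illustration.

```python
import cmath
import math


def exponential_sum(coeffs, p):
    """Sum over x in F_p of zeta^f(x), where f has the given coefficients
    (low degree first) and zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    total = 0
    for x in range(p):
        fx = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        total += zeta ** fx
    return total


p = 13
samples = [
    [0, 0, 1],        # f(T) = T^2
    [1, 2, 0, 1],     # f(T) = T^3 + 2T + 1
    [0, 1, 0, 0, 1],  # f(T) = T^4 + T
]
for coeffs in samples:
    m = len(coeffs) - 1                      # degree of f, with 0 < m < p
    size = abs(exponential_sum(coeffs, p))
    print(m, round(size, 4), round(m * math.sqrt(p), 4))
```

For $f(T) = T^2$ the sum is a quadratic Gauss sum, so its absolute value is exactly $\sqrt{p}$, comfortably within the bound.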

Applications of cohomological techniques and of methods from representation
theory to the study of exponential sums of general type were considerably
developed in the work of N.M.Katz [Kat76], [Kat88], [KL85], of Deligne and
other mathematicians in the 70s and 80s (cf. [Del74], [Del80b], [Bry86]).
In this research an exponential sum is interpreted as the trace of a certain
operator (the Frobenius operator or the monodromy operator) acting on the
space of global sections of a specially constructed sheaf on an algebraic
variety. Thus, the general exponential sums can be constructed using cohomology
groups with compact support on an appropriate Artin-Schreier covering $W$
of an affine variety $V$. Then estimates on the exponential sum can be reduced
to the Weil estimate W4) applied to a smooth compactification of $W$,
provided that this compactification exists (above we have considered an
example for curves). In the general case it is not even known whether such
compactifications exist. However, this difficulty can be coped with using the
technology developed in the second part of Deligne's paper on the Weil
conjectures [Del80b], which contains a vast generalization of these
conjectures. This generalization gives absolute values of the Frobenius
elements acting on cohomology with compact support on general varieties,
whereas the original formulation of the conjectures concerns essentially the
constant sheaves on smooth projective varieties. Impressive examples of the use
of Deligne's generalizations of the Weil conjectures were given by Katz
[Kat88] in the interesting case of the multi-dimensional Kloosterman sums of
type

$$\mathrm{Kl}(p, n, a) = \sum_{\substack{x_1, \dots, x_n \bmod p \\ x_1 \cdots x_n \equiv a \bmod p}} \exp \Bigl( \frac{2\pi i}{p} (x_1 + \cdots + x_n) \Bigr),$$

by proving their important equidistribution properties with $p$ fixed and
varying $a$. For an $l$-adic sheaf on an algebraic curve the equidistribution
properties of the traces of local Frobenius elements were naturally formulated
in terms of a certain algebraic group $G_{\mathrm{geom}}$ over
$\mathbb{Q}_l$, which is defined as the Zariski closure of the image of the
geometric fundamental group in the corresponding $l$-adic representation. Under
rather general assumptions, on the choice of an embedding of $\mathbb{Q}_l$
into $\mathbb{C}$ one obtains a complex algebraic group
$G_{\mathrm{geom}}(\mathbb{C})$, and the Frobenius elements correspond to
certain points in the space $K^\natural$ of conjugacy classes in a maximal
compact subgroup $K$ of $G_{\mathrm{geom}}(\mathbb{C})$. The equidistribution
property is to be understood in the sense of a measure $\mu^\natural$ on
$K^\natural$ obtained from the Haar measure on $K$. For the multi-dimensional
Kloosterman sums this construction leads to the groups

$$G_{\mathrm{geom}} = \mathrm{Sp}(n) \text{ for even } n \text{ and arbitrary } p,$$
$$G_{\mathrm{geom}} = \mathrm{SL}(n) \text{ for odd } pn,$$
$$G_{\mathrm{geom}} = \mathrm{SO}(n) \text{ for } p = 2 \text{ and odd } n \ge 3,\ n \ne 7,$$
$$\text{and } G_{\mathrm{geom}} = G_2 \text{ for } p = 2 \text{ and } n = 7. \tag{6.1.27}$$


These methods can be used to study the equidistribution of the arguments of
Gauss sums of the type

$$\theta(a) = \frac{g(\psi, \chi^a)}{\sqrt{q}} = \frac{1}{\sqrt{q}} \sum_{x \in \mathbb{F}_q^\times} \psi(x) \chi^a(x), \qquad |\theta(a)| = 1,$$

where $\psi$ is a non-trivial additive character of $\mathbb{F}_q$, $\chi$ a
generator of the cyclic group of multiplicative characters of
$\mathbb{F}_q^\times$ and $1 \le a \le q - 2$. Katz also proved that for a
fixed $r \ge 1$ the $r$-tuple of angles
$(\theta(a+1), \theta(a+2), \dots, \theta(a+r)) \in (S^1)^r$ for
$0 \le a \le q - 2 - r$ becomes equidistributed with respect to the Haar
measure on $(S^1)^r$ as $q \to \infty$. An interesting related construction of
the $l$-adic Fourier transform for sheaves on $\mathbb{A}^n$ was suggested by
Brylinski and Laumon (cf. [Bry86], [KL85]).

These results are related to the Sato-Tate Conjecture on the uniform
distribution of the arguments $\varphi_p$ of Frobenius automorphisms in the
segment $[0, \pi]$ with respect to the measure
$\frac{2}{\pi} \sin^2 \varphi \, d\varphi$ (for cusp forms $f$ without complex
multiplication) (cf. Chapter I in [Se68a], [Mich01] and §6.5.1). For recent
developments we refer to [KS99], [KS99a], [Mich01], [Sar98].


6.2 L-Functions, the Theory of Tate and Explicit Formulae


6.2.1 L-Functions of Rational Galois Representations 


Let $K$ be a number field, $\Sigma_K$ its set of places (classes of normalized
valuations) and

$$\rho : G_K \to \mathrm{GL}(V) \tag{6.2.1}$$

a representation of the Galois group $G_K = \mathrm{Gal}(\overline{K}/K)$ in a
finite dimensional vector space $V$ over a field $F$ of characteristic zero,
which we shall usually assume to be embedded in $\mathbb{C}$ (in examples and
applications we shall use $F = \mathbb{Q}$, $\mathbb{C}$ or $\mathbb{Q}_l$). We
call $\rho$ unramified at a non-Archimedean place $v$ if for all places $w$ of
$\overline{K}$ dividing $v$ one has $\rho(I^{(w)}) = \{1_V\}$, where
$I^{(w)}$ is the inertia group of $w$ over $v$. In this case the
representation $\rho$ can be factorized through the quotient group

$$G^{(w)} / I^{(w)} \cong G_{k(v)} = \mathrm{Gal}\bigl( (\mathcal{O}_{\overline{K}}/\mathfrak{p}_w) / (\mathcal{O}_K/\mathfrak{p}_v) \bigr),$$

where $G^{(w)}$ is the decomposition group of $w$ over $v$ and $G_{k(v)}$ the
Galois group of the algebraic closure
$\mathcal{O}_{\overline{K}}/\mathfrak{p}_w$ of the residue field
$k(v) = \mathcal{O}_K/\mathfrak{p}_v$. The group $G_{k(v)}$ is canonically
generated by the Frobenius automorphism $Fr_v$, $Fr_v(x) = x^{Nv}$. Choosing a
different place $w$ above $v$ will lead to the element $\rho(Fr_v)$ being
replaced by a conjugate element. Hence the conjugacy class of the Frobenius
element $F_{v,\rho} = \rho(Fr_v)$ is well defined, and we write

$$P_{v,\rho}(t) = \det(1_V - t \cdot F_{v,\rho}) \tag{6.2.2}$$

for the characteristic polynomial of this element. Suppose that $E$ is a number
field embedded in $\mathbb{C}$. We call the representation $\rho$ rational
(resp. entire) over $E$ if there exists a finite set of places
$S \subset \Sigma_K$ such that


a) for all $v \in \Sigma_K \setminus S$ the representation $\rho$ is
   unramified at $v$,
b) the coefficients of $P_{v,\rho}(t)$ belong to $E$ (resp. to the maximal
   order $\mathcal{O}_E$ of $E$).


Let $s$ be a complex number and $v \notin S$. We have

$$P_{v,\rho}(Nv^{-s}) = \det(1_V - Nv^{-s} F_{v,\rho}) = \prod_{i=1}^{d} (1 - \lambda_{i,v} Nv^{-s}),$$

where $d = \dim_F V$ and the $\lambda_{i,v}$ are algebraic numbers viewed as
complex numbers via a fixed embedding
$i : \overline{\mathbb{Q}} \to \mathbb{C}$.
Let us define the $L$-function of a rational representation:



$$L(\rho, s) = \prod_{v \notin S} P_{v,\rho}(Nv^{-s})^{-1}. \tag{6.2.3}$$

This is a formal Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ with
coefficients in $E$. In all known cases there exists a real constant $c$ such
that $|\lambda_{i,v}| \le (Nv)^c$, which implies the absolute convergence of
the series (6.2.3) in the complex right half plane $\mathrm{Re}(s) > 1 + c$.

There are various methods of completing the product in (6.2.3) at places
$v \in S$. The purpose of such a completion is to obtain an $L$-function
satisfying a certain nice functional equation. If $v$ is a non-Archimedean
place then one considers the subspace $V(v)$ consisting of elements fixed by
the inertia group $I^{(w)}$ for $w$ above $v$. If $\rho$ is ramified at $v$
then $V(v) \ne V$ (and possibly $V(v) = \{0\}$). The conjugacy class of
$\rho(Fr_v)|_{V(v)}$ and its characteristic polynomial

$$P_{v,\rho}(t) = \det(1_{V(v)} - t F_{v,\rho}|_{V(v)})$$

are then well defined, and the degree of the latter is smaller than $d$. Put

$$L_v(\rho, s) = P_{v,\rho}(Nv^{-s})^{-1}.$$

If $v$ is an Archimedean place then a good definition of $L_v(\rho, s)$
depends on an additional structure (e.g. a Hodge structure) on the vector
space $V$. In this case the $v$-factors can be expressed in terms of the
following $\Gamma$-factors:

$$\Gamma_{\mathbb{R}}(s) = \pi^{-s/2} \Gamma(s/2), \qquad \Gamma_{\mathbb{C}}(s) = 2 (2\pi)^{-s} \Gamma(s). \tag{6.2.4}$$

These factors play an important role in the study of the functional equations
satisfied by $L$-functions. If we put

$$\Lambda(\rho, s) = \prod_{v \in \Sigma_K} L_v(\rho, s), \tag{6.2.5}$$

then in important examples it is possible to prove that the function (6.2.5)
admits an analytic continuation onto the entire $s$-plane and satisfies a
certain functional equation relating $\Lambda(\rho, s)$ and
$\Lambda(\rho^\vee, k - s)$, where $\rho^\vee$ is the representation dual to
$\rho$ and $k$ is a real number.


Example 6.9. If $\chi$ is a primitive Dirichlet character then in view of the
Kronecker-Weber theorem there is associated to $\chi$ a one dimensional
representation $\rho_\chi : G_{\mathbb{Q}} \to \mathbb{C}^\times$ such that

$$L(\rho_\chi, s) = \prod_p (1 - \chi(p) p^{-s})^{-1} = L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}.$$

Let $\delta$ be zero or one so as to satisfy $\chi(-1) = (-1)^\delta$. We then
have that

$$\Lambda(\rho_\chi, s) = \Gamma_{\mathbb{R}}(s + \delta) L(\rho_\chi, s) = \xi(s, \chi),$$

and the following functional equation holds (cf. Ch. 3 of [Shi71] and
[Wei67]):

$$\Lambda(\rho_{\overline{\chi}}, 1 - s) = \frac{i^\delta C_\chi^s}{g(\chi)} \Lambda(\rho_\chi, s), \tag{6.2.6}$$

where $C_\chi$ is the conductor of $\chi$ and $g(\chi)$ is the Gauss sum of
$\chi$. The function $\xi(s, \chi)$ is holomorphic on the entire $s$-plane for
nontrivial characters $\chi$. If $\chi$ is trivial then
$\xi(s) = \pi^{-s/2} \Gamma(s/2) \zeta(s)$ has simple poles at $s = 0$ and
$s = 1$, and satisfies the functional equation

$$\xi(1 - s) = \xi(s). \tag{6.2.7}$$
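The equality of the Euler product and the Dirichlet series in Example 6.9 can be observed numerically. The sketch below takes the non-trivial character modulo 4 (a choice made here for illustration) and compares both truncated expressions at $s = 2$ with Catalan's constant, the known value of $L(2, \chi)$ for this character.

```python
import math

# Catalan's constant, the known value of L(2, chi) for the character mod 4.
CATALAN = 0.915965594177219


def chi4(n):
    """The non-trivial Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def dirichlet_series(s, terms):
    """Partial sum of sum_n chi(n) n^(-s)."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


def euler_product(s, prime_bound):
    """Partial product of prod_p (1 - chi(p) p^(-s))^(-1)."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        value /= 1.0 - chi4(p) / p ** s
    return value


print(dirichlet_series(2, 20000), euler_product(2, 20000), CATALAN)
```

The truncated product converges more slowly than the alternating series, so the two printed values agree only to a few decimal places at this cutoff.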


6.2.2 The Formalism of Artin 


The definition of the $L$-functions $L(\rho, s)$ described above is due to
E.Artin (cf. [Ar30]), who studied representations with a finite image
$\mathrm{Im}(\rho)$. In this situation the representation $\rho$ is always
$E$-rational for some number field $E$, and is semi-simple (by Maschke's
theorem). Hence $\rho$ is uniquely determined (up to equivalence) by its
character $\chi_\rho$ ($\chi_\rho(g) := \mathrm{Tr}\, \rho(g)$, $g \in G_K$).
If $\rho$ is an arbitrary rational representation then the function
$L(\rho, s)$ is uniquely defined by its character $\chi_\rho$. This can easily
be seen by taking the logarithmic derivative of the series (6.2.3)

$$\frac{L'(\rho, s)}{L(\rho, s)}
  = -\sum_{v \notin S} \sum_{i=1}^{d} \sum_{m=1}^{\infty} \lambda_{i,v}^m \log(Nv)\, Nv^{-ms}
  = -\sum_{v, m} \frac{\chi_\rho(F_v^m) \log(Nv)}{Nv^{ms}}, \tag{6.2.8}$$

where $F_v^m$ denotes the conjugacy class of the $m$-th power of the Frobenius
element $F_{v,\rho}$. In view of this fact one often uses the notation
$L(\chi_\rho, s) = L(\rho, s)$.
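If

$$\rho_i : G_K \to \mathrm{GL}(V_i) \quad (i = 1, 2)$$

are two rational representations with characters
$\chi_i = \mathrm{Tr}\, \rho_i$ then one can construct from them the further
representations

$$\rho_1 \oplus \rho_2 : G_K \to \mathrm{GL}(V_1 \oplus V_2),$$
$$\rho_1 \otimes \rho_2 : G_K \to \mathrm{GL}(V_1 \otimes V_2), \tag{6.2.9}$$

whose characters are equal to $\chi_1 + \chi_2$ and $\chi_1 \cdot \chi_2$
respectively. We have that

$$L(\chi_1 + \chi_2, s) = L(\chi_1, s) L(\chi_2, s). \tag{6.2.10}$$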


If $K'/K$ is a finite extension and $\rho'$ is a Galois representation of $K'$ with character $\chi'$, then one can define the induced representation $\rho = \mathrm{Ind}\,\rho'$, whose character is given by the formula:

\[ \chi(g) = \sum_{\gamma \in G_K/G_{K'}} \chi'(\gamma^{-1} g \gamma), \tag{6.2.11} \]


where $\gamma$ runs over a full set of representatives of left cosets, and it is assumed that $\chi'$ is extended by zero to the whole group $G_K$.

In this notation the following equation holds


\[ L(\rho, s) = L(\rho', s). \tag{6.2.12} \]
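The induced character formula can be tried out on a small finite group. The sketch below is an illustration under assumptions not taken from the text: the group is $G = S_3$ (permutations stored as tuples), the subgroup $H$ is generated by a 3-cycle, and $\chi'$ is a non-trivial character of $H$. It evaluates $\chi(g) = \sum_\gamma \chi'(\gamma^{-1} g \gamma)$ with $\chi'$ extended by zero, and recovers the character of the two-dimensional irreducible representation of $S_3$.

```python
import cmath
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))      # the group S_3
e, c = (0, 1, 2), (1, 2, 0)           # identity and a 3-cycle
c2 = compose(c, c)
w = cmath.exp(2j * cmath.pi / 3)
chi_H = {e: 1, c: w, c2: w * w}       # a non-trivial character of H = <c>

def chi_ext(g):
    """chi' extended by zero outside H."""
    return chi_H.get(g, 0)

# Left coset representatives of H in G: the identity and one transposition.
reps = [e, (1, 0, 2)]

def induced(g):
    """Induced character: sum over coset representatives r of chi'(r^-1 g r)."""
    return sum(chi_ext(compose(inverse(r), compose(g, r))) for r in reps)

for g in G:
    value = induced(g)
    print(g, round(value.real, 9), round(value.imag, 9))
```

The values are 2 on the identity, $-1$ on the 3-cycles and 0 on the transpositions, independently of the chosen representatives.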


If $K'/K$ is a finite Galois extension then $G_{K'}$ is a normal subgroup of finite index in $G_K$. Then for any representation $\rho$ of $G_K$ rational over $E$ we define its restriction to the subgroup $G_{K'}$: $\rho_1 = \mathrm{Res}\,\rho$, $\rho_1 : G_{K'} \to \mathrm{GL}(V)$. Then one has the following factorization formula due to Artin:


\[ L(\rho_1, s) = \prod_{\chi \in \mathrm{Irr}\,G(K'/K)} L(\rho \otimes \rho_\chi, s)^{\deg\chi}, \tag{6.2.13} \]


where the product is taken over the set of characters $\chi$ of all irreducible representations of the quotient group $G(K'/K) = G_K/G_{K'}$, $\deg\chi = \chi(1)$ (it is assumed that the field $E$ contains all values of the characters $\chi$, which are certain sums of roots of unity).
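A concrete instance of Artin's factorization can be verified coefficient by coefficient. The following sketch assumes the special case $K = \mathbb{Q}$, $K' = \mathbb{Q}(i)$ and $\rho$ trivial, where the formula reads $\zeta_{\mathbb{Q}(i)}(s) = \zeta(s)L(s,\chi_4)$ with $\chi_4$ the non-trivial Dirichlet character modulo 4. It compares the number of ideals of $\mathbb{Z}[i]$ of norm $n$ with the $n$-th Dirichlet coefficient $\sum_{d \mid n}\chi_4(d)$ of the product; the helper names are hypothetical.

```python
def chi4(n):
    """Non-trivial character of Gal(Q(i)/Q), viewed as a Dirichlet character mod 4."""
    return (0, 1, 0, -1)[n % 4]

def ideals_of_norm(n):
    """Number of ideals of Z[i] of norm n; every ideal is principal with exactly 4 generators."""
    count = 0
    a = 0
    while a * a <= n:
        b = 0
        while a * a + b * b <= n:
            if a * a + b * b == n:
                # count all sign choices (+-a, +-b) of the generator a + b*i
                count += (1 if a == 0 else 2) * (1 if b == 0 else 2)
            b += 1
        a += 1
    return count // 4

def product_coefficient(n):
    """n-th Dirichlet coefficient of zeta(s) * L(s, chi4), i.e. sum of chi4(d) over d | n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

for n in (1, 2, 3, 5, 25, 65):
    print(n, ideals_of_norm(n), product_coefficient(n))
```

Agreement of the two columns for all $n$ is equivalent to the equality of the two Dirichlet series in this example.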

In the general case the representation $\rho$ can always be replaced by its semi-simplification $\tilde\rho$, which has the same character. In order to define this consider the composition series


\[ V = V^{(0)} \supset V^{(1)} \supset \cdots \supset V^{(m)} = \{0\}, \]


of $\rho$-invariant subspaces with irreducible factors $V^{(i)}/V^{(i+1)}$ ($0 \le i \le m-1$). Then the representation $\tilde\rho$ of $G_K$ in the space


\[ \tilde V = V^{(0)}/V^{(1)} \oplus V^{(1)}/V^{(2)} \oplus \cdots \oplus V^{(m-1)}/V^{(m)} \tag{6.2.14} \]


is semi-simple, $E$-rational and has the same character as $\rho$. Furthermore it is uniquely determined (up to equivalence) by its character.

The representation $\rho : G_K \to \mathrm{GL}(V)$ is called Abelian if $\mathrm{Im}\,\rho$ is an Abelian group. In this case we may consider $\rho$ as a representation of the group $G_K^{ab} = G_K/G_K^{c}$, where $G_K^{c}$ is the commutator subgroup of $G_K$ (i.e. the minimal closed subgroup containing all commutators), $\rho : G_K^{ab} \to \mathrm{GL}(V)$.

We have already seen in §5.3 certain examples of such representations (on the Tate module $V_l(E)$ of an elliptic curve with complex multiplication, and on the one-dimensional Tate module $V_l(\mu) = \mathbb{Q}_l(1)$ of roots of unity of $l$-power order).

If $\mathrm{Im}\,\rho$ is a finite group (not necessarily Abelian) then for a finite Galois extension $K'/K$ one has $\mathrm{Ker}\,\rho = G_{K'}$ and one uses the notation


\[ L(s,\chi, K'/K) = L(\rho,s), \tag{6.2.15} \]


where $\chi$ is the character of the corresponding representation
$\rho : \mathrm{Gal}(K'/K) \to \mathrm{GL}(V)$.


The functions $L(s,\chi, K'/K)$ are usually called Artin $L$-functions; they can be reduced to products of Abelian $L$-functions of extensions of $K$ using the formalism of Artin and the famous theorem of Brauer: if $\chi$ is a character of a representation of a finite group $G$ over $\mathbb{C}$ then there are elementary subgroups $G_i \subset G$ and one-dimensional characters $\chi_i$ of $G_i$ such that


\[ \chi = \sum_i a_i\,\mathrm{Ind}_{G_i}^{G}\chi_i \quad (a_i \in \mathbb{Z}). \tag{6.2.16} \]


If $G = G(K'/K)$ then $G_i = G(K'/K_i)$ and it follows from the relations (6.2.10) and (6.2.12) that


\[ L(s,\chi, K'/K) = \prod_i L(s,\chi_i, K'/K_i)^{a_i}. \tag{6.2.17} \]


The analytic properties of Abelian $L$-functions are well known and they are analogous to the corresponding properties of the Dirichlet $L$-series: all of the functions $L(s,\chi_i, K'/K_i)$ can be meromorphically continued onto the entire complex plane, possibly with a simple pole at $s = 1$ for trivial characters $\chi_i$. This implies the existence of a meromorphic continuation of arbitrary Artin $L$-functions. The famous Artin conjecture says that for a non-trivial irreducible character $\chi$ of $G = G(K'/K)$ the function $L(s,\chi, K'/K)$ is always holomorphic. (Note that another famous conjecture of Artin, on primitive roots, was discussed in §1.1.4.)

However this conjecture seems to be very difficult in general, as is the generalized Riemann hypothesis which says that all zeroes of the function $L(s,\chi, K'/K)$ lying in the strip $0 < \mathrm{Re}(s) < 1$ are actually on the line $\mathrm{Re}(s) = \tfrac{1}{2}$. The difficulty with the Artin conjecture is related to the fact that one lacks a non-local definition of the Dirichlet series representing (6.2.17). In the Abelian case such a description follows from the fact that the coefficients of the Dirichlet series are "periodic" modulo a positive integer (or modulo an integral ideal of a number field). However in a number of interesting cases the Artin conjecture has been proved using the Mellin transforms of automorphic forms. A general global description of Artin $L$-series in terms of automorphic forms is given by the Langlands program (cf. §6.4, 6.5).


6.2.3 Example: The Dedekind Zeta Function 


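Let $X = \mathrm{Spec}\,O_K$ where $O_K$ is the maximal order of $K$. The Dedekind zeta function of $K$ is the following Euler product


\[ \zeta_K(s) = \zeta(X,s) = \prod_{\mathfrak{p} \subset O_K} \bigl(1 - \mathrm{N}\mathfrak{p}^{-s}\bigr)^{-1}, \]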


which is absolutely convergent for $\mathrm{Re}(s) > 1$ and admits a meromorphic continuation onto the entire complex plane. The continuation is holomorphic with the exclusion of a simple pole at $s = 1$. The residue of $\zeta_K(s)$ at $s = 1$ is known to be equal to (cf. [BS85], [Wei74a])


\[ \mathrm{Res}_{s=1}\,\zeta_K(s) = h_K\,\frac{2^{r_1}(2\pi)^{r_2}R_K}{w_K\sqrt{|D_K|}}, \tag{6.2.18} \]


where $h_K$ is the class number of $K$, $R_K$ is its regulator, $D_K$ its discriminant, $w_K$ the number of roots of unity in $K$ and $r_1$ (resp. $r_2$) the number of real (resp. complex) places of $K$, so that $K \otimes \mathbb{R} \cong \mathbb{R}^{r_1} \oplus \mathbb{C}^{r_2}$.


From the point of view of $L$-functions the function $\zeta_K(s)$ corresponds to the trivial Galois representation of the group $G_K$. It therefore follows from Artin's factorization formula (for a Galois extension $K/\mathbb{Q}$) that


\[ \zeta_K(s) = \zeta(s) \prod_{\substack{\chi \in \mathrm{Irr}\,G(K/\mathbb{Q}) \\ \chi \ne 1}} L(s,\chi, K/\mathbb{Q})^{\deg\chi}, \tag{6.2.19} \]


where the product is taken over all non-trivial irreducible representations of the group $G(K/\mathbb{Q})$. If the extension $K/\mathbb{Q}$ is Abelian then (6.2.19) implies the following class number formula:


\[ h_K = \frac{w_K\sqrt{|D_K|}}{2^{r_1}(2\pi)^{r_2}R_K} \prod_{\chi \ne 1} L(1,\chi), \tag{6.2.20} \]


since $\mathrm{Res}_{s=1}\,\zeta(s) = 1$. It is not difficult to compute the values $L(1,\chi)$: let $C_\chi$ be the conductor of $\chi$. Then


a) for $\chi(-1) = -1$ one has


\[ L(1,\chi) = \frac{\pi i\,g(\chi)}{C_\chi^{2}} \sum_{\substack{(k,C_\chi)=1 \\ 0<k<C_\chi}} k\,\bar\chi(k) \tag{6.2.21} \]
and
\[ L(0,\chi) = -\frac{1}{C_\chi} \sum_{\substack{(k,C_\chi)=1 \\ 0<k<C_\chi}} k\,\chi(k); \quad \text{in particular } L(0,\chi_3) = \frac{1}{3} \tag{6.2.22} \]


for $\chi_3(d) = \left(\frac{d}{3}\right)$ (comp. with the functional equation given by the equality (6.2.6), and see also [Hi93], p.66); the equality (6.2.22) is used in §7.2 in expressing the constant term of an Eisenstein series of weight one.

b) for $\chi(-1) = 1$, $\chi$ non-trivial, one has

\[ L(1,\chi) = -\frac{g(\chi)}{C_\chi} \sum_{\substack{(k,C_\chi)=1 \\ 0<k<C_\chi}} \bar\chi(k)\,\log\bigl|1 - \zeta^{-k}\bigr|, \tag{6.2.23} \]


where $\zeta = \exp(2\pi i/C_\chi)$ is a primitive root of unity of degree $C_\chi$ and $g(\chi)$ is the Gauss sum of $\chi$.


Formulae (6.2.21) and (6.2.23) give essential information on the class number, the regulator, and the structure of the class group $\mathrm{Cl}_K$ of an Abelian field $K$, in particular when $K$ is quadratic or cyclotomic (cf. [BS85]).
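The closed formulae for $L(1,\chi)$ and $L(0,\chi)$ can be compared with a direct summation of the Dirichlet series. The sketch below is a numerical check under stated assumptions: it uses the standard normalizations $L(1,\chi) = \frac{\pi i\,g(\chi)}{C^2}\sum k\bar\chi(k)$ for odd $\chi$, $L(1,\chi) = -\frac{g(\chi)}{C}\sum \bar\chi(k)\log|1-\zeta^{-k}|$ for even $\chi$, and $L(0,\chi) = -\frac{1}{C}\sum k\chi(k)$, with $g(\chi) = \sum_k \chi(k)e^{2\pi i k/C}$, and tests them only on the real characters modulo 3 and 5 (so conjugation plays no role). All function names are illustrative.

```python
import cmath
import math
from fractions import Fraction

def quadratic_character(p):
    """Return the quadratic character modulo an odd prime p as a function on integers."""
    table = [0] * p
    for a in range(1, p):
        table[a] = 1 if pow(a, (p - 1) // 2, p) == 1 else -1
    return lambda n: table[n % p]

def gauss_sum(chi, C):
    return sum(chi(k) * cmath.exp(2j * cmath.pi * k / C) for k in range(1, C))

def L1_series(chi, C, periods=20000):
    """Partial sum of sum chi(n)/n over whole periods (slowly convergent reference value)."""
    return sum(chi(n) / n for n in range(1, C * periods + 1))

def L1_odd(chi, C):
    """Closed formula for L(1, chi), chi(-1) = -1, chi real."""
    return (cmath.pi * 1j * gauss_sum(chi, C) / C ** 2) * sum(k * chi(k) for k in range(1, C))

def L1_even(chi, C):
    """Closed formula for L(1, chi), chi(-1) = 1, chi real and non-trivial."""
    z = cmath.exp(2j * cmath.pi / C)
    s = sum(chi(k) * math.log(abs(1 - z ** (-k))) for k in range(1, C) if chi(k) != 0)
    return -(gauss_sum(chi, C) / C) * s

def L0_odd(chi, C):
    """Closed formula for L(0, chi), chi(-1) = -1, as an exact rational number."""
    return Fraction(-sum(k * chi(k) for k in range(1, C)), C)

chi3, chi5 = quadratic_character(3), quadratic_character(5)
print(L0_odd(chi3, 3))
print(L1_odd(chi3, 3).real, L1_series(chi3, 3))
print(L1_even(chi5, 5).real, L1_series(chi5, 5))
```

The series is summed over complete periods because the partial sums then differ from the limit by $O(1/N)$; the agreement is therefore only to a few decimal places.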

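For a quadratic field of discriminant $D = D_K > 0$ we have


\[ h_K = -\frac{1}{\log\varepsilon} \sum_{\substack{(k,D)=1 \\ 0<k<D/2}} \chi(k)\,\log\sin(\pi k/D), \tag{6.2.24} \]
where $\chi$ is the quadratic character of $K$ and $\varepsilon$ is the fundamental unit of $K$ with $\varepsilon > 1$.
If $D = D_K < -4$ then


\[ h_K = -\frac{1}{|D|} \sum_{\substack{(k,D)=1 \\ 0<k<|D|}} k\,\chi(k) = \bigl(2 - \chi(2)\bigr)^{-1} \sum_{\substack{(k,D)=1 \\ 0<k<|D|/2}} \chi(k), \tag{6.2.25} \]


and for the remaining fields $K = \mathbb{Q}(\sqrt{-1})$, $\mathbb{Q}(\sqrt{-3})$ one has $h_K = 1$ (Dirichlet's class number formula).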
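Dirichlet's formulae lend themselves to a direct computation. The following sketch assumes that $\chi$ is the Kronecker symbol $\chi(k) = (D/k)$ of a fundamental discriminant $D$, evaluates both expressions for $D < -4$ exactly, and evaluates the logarithmic expression for $D > 0$ in floating point. The fundamental units passed to `h_real` are supplied by hand and are part of the assumptions, as are the function names.

```python
import math
from fractions import Fraction

def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and an integer n > 0."""
    if a % 2 == 0 and n % 2 == 0:
        return 0
    result = 1
    while n % 2 == 0:               # factors of 2 in n: (a/2) = -1 iff a = 3, 5 mod 8
        n //= 2
        if a % 8 in (3, 5):
            result = -result
    a %= n                          # from here on: Jacobi symbol with n odd
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def h_imag_first(D):
    """h = -(1/|D|) * sum_{0<k<|D|} k chi(k) for a fundamental discriminant D < -4."""
    return Fraction(-sum(k * kronecker(D, k) for k in range(1, -D)), -D)

def h_imag_second(D):
    """h = (2 - chi(2))^-1 * sum_{0<k<|D|/2} chi(k) for a fundamental discriminant D < -4."""
    s = sum(kronecker(D, k) for k in range(1, (-D + 1) // 2))
    return Fraction(s, 2 - kronecker(D, 2))

def h_real(D, eps):
    """h = -(1/log eps) * sum_{0<k<D/2} chi(k) log sin(pi k / D) for D > 0."""
    s = sum(kronecker(D, k) * math.log(math.sin(math.pi * k / D))
            for k in range(1, (D + 1) // 2) if kronecker(D, k) != 0)
    return -s / math.log(eps)

for D in (-7, -8, -15, -20, -23, -47, -163):
    print(D, h_imag_first(D), h_imag_second(D))
print(5, h_real(5, (1 + math.sqrt(5)) / 2))
print(40, h_real(40, 3 + math.sqrt(10)))
```

Terms with $(k, D) > 1$ drop out automatically because the Kronecker symbol vanishes there.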

We mention that there exists a purely arithmetical proof of the formula (6.2.25) in the case $D \not\equiv 1 \pmod 8$ found by B.A. Venkov (cf. [V81]).

The number


\[ \varkappa_K = \mathrm{Res}_{s=1}\,\zeta_K(s) = h_K\,\frac{2^{r_1}(2\pi)^{r_2}R_K}{w_K\sqrt{|D_K|}} \]


has a geometric interpretation as the volume of a fundamental domain for $K^\times$ in $J_K^{1}$ with respect to the measure on the group $J_K^{1} = \{x \in J_K \mid \|x\| = 1\}$, which comes from the normalized Haar measure $\mu^\times$ on the group $J_K$ (see §4.3).


6.2.4 Hecke Characters and the Theory of Tate 


(cf. [La70], [CF67], [Wei74a]). The Abelian $L$-functions of a number field $K$ can be described using class field theory, which states in particular that there is a one-to-one correspondence between irreducible complex representations of the group $G_K^{ab}$ and characters of finite order of the idele class group $C_K = J_K/K^\times$. In the classical theory these characters are known as "periodic" characters of the group of fractional ideals of $K$. For any integral ideal $\mathfrak{m} \subset O_K$ write $S = S(\mathfrak{m})$ for the finite set of places of $K$ given by $S = S(\mathfrak{m}) = \{v \in \Sigma_K \mid v \text{ divides } \mathfrak{m}\} \cup \Sigma_\infty$, where $\Sigma_\infty$ is the set of Archimedean places of $K$. Let $I^{(S)}$ be the free Abelian group generated by the finite places prime to $\mathfrak{m}$ and let


\[ P^{(\mathfrak{m})} = \{x \in K^\times \mid x \equiv 1 \ (\mathrm{mod}\ \mathfrak{m}O_v) \text{ for } v \in S\}. \tag{6.2.26} \]


Then for each one-dimensional representation $\rho : G_K^{ab} \to \mathbb{C}^\times$ there exists an integral ideal $\mathfrak{m}$ and a character $\chi : I^{(S)} \to \mathbb{C}^\times$ trivial on the subgroup $(P^{(\mathfrak{m})})$ of principal ideals of type $(x)$, $x \in P^{(\mathfrak{m})}$, such that $\rho(\mathrm{Fr}_v) = \chi(\mathfrak{p}_v)$. The generalized Dirichlet $L$-series are then defined by


\[ L(s,\chi) = \prod_{v \notin S} \bigl(1 - \chi(\mathfrak{p}_v)\mathrm{N}\mathfrak{p}_v^{-s}\bigr)^{-1} = \sum_{\mathfrak{n}:\ \mathfrak{n}+\mathfrak{m}=O_K} \chi(\mathfrak{n})\,\mathrm{N}\mathfrak{n}^{-s}, \tag{6.2.27} \]


where $\mathfrak{n}$ runs through the integral ideals coprime to $\mathfrak{m}$.

Hecke has introduced a new class of characters and $L$-functions, which, in principle, cannot be reduced to $L$-functions of rational Galois representations. These characters are associated to arbitrary continuous homomorphisms


\[ \psi : J_K/K^\times \to \mathbb{C}^\times \tag{6.2.28} \]


and they can be described in classical terms as follows: there exists $\mathfrak{m} \subset O_K$ and a homomorphism $\chi : I^{(S)} \to \mathbb{C}^\times$ such that for all $x \in P^{(\mathfrak{m})}$ one has


ee) = T] (FS) tesa’, (6.2.20) 
v|oo - 
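where $\tau_v : K \to \mathbb{C}$ is the complex embedding which defines $v$; $|x|_v = |\tau_v x|^{[K_v:\mathbb{R}]}$ is the corresponding normalized norm; $t_v$ and $\sigma \in \mathbb{R}$; $a_v \in \mathbb{Z}$ for $K_v = \mathbb{C}$, $a_v = 0$ or $1$ for $K_v = \mathbb{R}$. Since $\chi((x))$ depends only on the ideal $(x)$, the right hand side of (6.2.29) must equal 1 for all $\varepsilon \in P^{(\mathfrak{m})} \cap E_K$. The ideal $\mathfrak{m}$ can be maximally chosen for the above condition; this ideal is called the conductor of $\chi$ and is denoted by $\mathfrak{m} = \mathfrak{f}(\chi)$. The above condition $\chi((\varepsilon)) = 1$ imposes some restriction on the choice of the numbers $t_v$, $\sigma$ and $a_v$. One verifies that these conditions define a subgroup of $(\mathbb{Z}/2\mathbb{Z})^{r_1} \oplus \mathbb{Z}^{r_2} \oplus \mathbb{R}^{r_1+r_2} \oplus \mathbb{R}$ which is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^{r_1} \oplus \mathbb{Z}^{r_2} \oplus \mathbb{Q}^{r_1+r_2-1} \oplus \mathbb{R}$.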

A correspondence between $\chi$ and $\psi$ is defined using the homomorphism which takes an idele to its divisor (cf. [CF67], Chapter 8)


\[ \mathrm{div}_S : J_K \to I^{(S)}, \qquad (x_v)_v \mapsto \sum_{v \notin S} v(x_v)\cdot v, \tag{6.2.30} \]


and using an appropriate section $\pi : I^{(S)} \to J_K$ which is defined by a choice of local uniformizing parameters $\pi_v \in O_v$ ($v(\pi_v) = 1$ for $v \notin S$): by this section a prime ideal $\mathfrak{p}_v$ goes to the idele $\pi(v) = (\cdots, 1, \pi_v, 1, \cdots)$ whose $v$-coordinate is $\pi_v$ and whose other coordinates are 1. Then the character $\psi$ corresponds to a unique $\chi$ such that $\chi(\mathfrak{p}_v) = \psi(\pi(v))$, and this is a one-to-one correspondence.

Example 6.10. 1) If $\psi(x) = \|x\|^{s} = \omega_s(x)$, then $\mathfrak{m} = O_K$ and
\[ \chi(\mathfrak{p}_v) = \mathrm{N}\mathfrak{p}_v^{-s}, \quad t_v = \mathrm{Im}\,s, \quad \sigma = \mathrm{Re}\,s. \tag{6.2.31} \]


2) If $K$ is an imaginary quadratic field, $K \subset \mathbb{C}$, then for an arbitrary $\chi$ there exists $\mathfrak{m}$ such that


eo) =F) be = ete (6.2.32) 


(for all $x \in K$, $x \equiv 1 \pmod{\mathfrak{m}}$).
3) If $K$ is a real quadratic field, $K \subset \mathbb{R}$, then


x((a)) = 22?! 8 |)° (6.2.33) 


for $x \equiv 1 \pmod{\mathfrak{m}}$, $x, x' > 0$, $p \in \mathbb{Q}$, where $\varepsilon$ is a fundamental unit in $K$ and $x \mapsto x'$ is the quadratic conjugation.
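The compatibility condition with units can be made explicit in a toy case. The sketch below assumes $K = \mathbb{Q}(i)$, $\mathfrak{m} = O_K = \mathbb{Z}[i]$ and a candidate character of the purely "angular" form $\chi((x)) = (x/|x|)^{a}$ with $a \in \mathbb{Z}$; this normalization is an assumption made for illustration, not a formula quoted from the text. Since every ideal is principal with generator determined up to the units $\pm 1, \pm i$, the value depends only on the ideal $(x)$ exactly when $a$ is divisible by 4.

```python
UNITS = (1, 1j, -1, -1j)   # the unit group of Z[i]

def chi(x, a):
    """Candidate value chi((x)) = (x/|x|)^a for a non-zero Gaussian integer x."""
    return (x / abs(x)) ** a

def well_defined(a, samples):
    """True if chi(u*x) == chi(x) for all units u, i.e. chi depends only on the ideal (x)."""
    return all(abs(chi(u * x, a) - chi(x, a)) < 1e-9 for x in samples for u in UNITS)

# A few non-zero Gaussian integers used as test generators.
samples = [complex(p, q) for p in range(-3, 4) for q in range(-3, 4) if (p, q) != (0, 0)]

admissible = [a for a in range(-8, 9) if well_defined(a, samples)]
print(admissible)
```

For admissible exponents the function is multiplicative on generators, hence defines a character of the group of ideals of $\mathbb{Z}[i]$.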


The interpretation of Hecke characters as certain characters of the idele class group was given by C. Chevalley (cf. [Chev40]).

Tate constructed in his thesis a general theory which makes it possible in particular to establish analytic continuation and a functional equation for all functions of type $L(s,\chi)$. We describe briefly the key points of this theory, which is based on Fourier analysis in number fields ([La70], [Ta65]).

Every continuous character $\psi : J_K/K^\times \to \mathbb{C}^\times$ may be regarded as a function on $J_K$ and it can be decomposed into a product $\psi(x) = \prod_v \psi_v(x_v)$, where $\psi_v : K_v^\times \to \mathbb{C}^\times$ are quasicharacters (i.e. continuous homomorphisms to $\mathbb{C}^\times$) such that for almost all $v$ the quasicharacter $\psi_v$ is unramified: $\psi_v(O_v^\times) = 1$, and in view of the continuity one has $|\psi_v(x_v)| = |x_v|_v^{\sigma}$. The number $\sigma = \mathrm{Re}\,\psi$ is called the real part of $\psi$.

The first stage in Tate's theory is to obtain a representation for a local factor of the Hecke $L$-function


\[ L(s,\chi) = \prod_{v \notin S} \bigl(1 - \chi(\mathfrak{p}_v)\mathrm{N}\mathfrak{p}_v^{-s}\bigr)^{-1} = \sum_{\mathfrak{n}:\ \mathfrak{n}+\mathfrak{m}=O_K} \chi(\mathfrak{n})\,\mathrm{N}\mathfrak{n}^{-s}, \tag{6.2.34} \]


as a certain integral over the locally compact group $K_v^\times$ with respect to the Haar measure $\mu_v^\times$ normalized by the condition $\mu_v^\times(O_v^\times) = 1$.

Let $c : K_v^\times \to \mathbb{C}^\times$ be an unramified quasicharacter. We use the decompositions $K_v^\times = \bigcup_{n \in \mathbb{Z}} \pi_v^{n}O_v^\times$ and $O_v\setminus\{0\} = \bigcup_{n \ge 0} \pi_v^{n}O_v^\times$ in order to calculate the integral $\int_{O_v\setminus\{0\}} c(x)\,d\mu_v^\times(x)$. Consider first the integral $\int_{\pi_v^{n}O_v^\times} c(x)\,d\mu_v^\times(x)$ and put $x = \pi_v^{n}\varepsilon$ with $\varepsilon \in O_v^\times$. Then


\[ \int_{\pi_v^{n}O_v^\times} c(x)\,d\mu_v^\times(x) = c(\pi_v)^{n} \int_{O_v^\times} c(\varepsilon)\,d\mu_v^\times(\varepsilon) = c(\pi_v)^{n} \]


in view of the invariance of $d\mu_v^\times(x)$ under multiplicative shifts: $d\mu_v^\times(ax) = d\mu_v^\times(x)$.

If $\mathrm{Re}\,c > 0$ then it follows that $c$ is integrable on $O_v\setminus\{0\} = \bigcup_{n \ge 0} \pi_v^{n}O_v^\times$ and one has


\[ \int_{O_v\setminus\{0\}} c(x)\,d\mu_v^\times(x) = \sum_{n \ge 0} \int_{\pi_v^{n}O_v^\times} c(x)\,d\mu_v^\times(x) = \sum_{n \ge 0} c(\pi_v)^{n} = \bigl(1 - c(\pi_v)\bigr)^{-1}. \tag{6.2.35} \]


If $c = \psi_v\omega_s$ then $c(\pi_v) = \chi(\mathfrak{p}_v)\mathrm{N}\mathfrak{p}_v^{-s}$ in view of (6.2.29), and the expression (6.2.35) becomes the local factor of the $L$-series (6.2.34).

The essential fact is that the whole product (6.2.34) can also be interpreted as a certain integral over a locally compact group with respect to an invariant Haar measure. The set $O_v\setminus\{0\}$ in (6.2.35) refers more to an additive theory than to a multiplicative theory. This combination of additive with multiplicative theories is a characteristic feature of Tate's theory.

Let $G$ be a locally compact group with a Haar measure $\mu$; denote by $L^{1}(G)$ the vector space of all integrable functions on $G$. If $\mu_v$ is the Haar measure on the additive group $K_v$ normalized by the condition $\mu_v(O_v) = 1$ then a calculation analogous to (6.2.35) shows that


\[ d\mu_v^\times(x) = \bigl(1 - \mathrm{N}\mathfrak{p}_v^{-1}\bigr)^{-1}\,\frac{d\mu_v(x)}{|x|_v} \tag{6.2.36} \]


in view of the multiplicative invariance of the measure $d\mu_v(x)/|x|_v$. In particular, the condition $f \in L^{1}(K_v^\times)$ is equivalent to saying that $f(x)/|x|_v \in L^{1}(K_v)$.
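The two local computations above can be followed in exact arithmetic. The sketch below assumes $K_v = \mathbb{Q}_p$ with $\pi_v = p$, $\mu_v(\mathbb{Z}_p) = 1$ and $|p|_v = p^{-1}$. It checks that the normalizing factor $(1 - p^{-1})^{-1}$ gives every shell $p^{n}\mathbb{Z}_p^\times$ multiplicative measure 1, and that the partial sums of $\sum_{n \ge 0} c(\pi_v)^{n}$ for $c = \omega_s$ approach the Euler factor $(1 - p^{-s})^{-1}$. The function names are illustrative.

```python
from fractions import Fraction

def additive_shell(p, n):
    """mu_v(p^n Z_p^x) = mu_v(p^n Z_p) - mu_v(p^(n+1) Z_p) = p^-n - p^-(n+1)."""
    return Fraction(1, p ** n) - Fraction(1, p ** (n + 1))

def multiplicative_shell(p, n):
    """mu_v^x(p^n Z_p^x) from d mu^x = (1 - 1/p)^-1 d mu / |x|, with |x| = p^-n on the shell."""
    return additive_shell(p, n) / Fraction(1, p ** n) / (1 - Fraction(1, p))

def local_integral(p, s, terms=200):
    """Partial sum of sum_{n>=0} c(pi)^n for the unramified quasicharacter c = omega_s."""
    c = p ** (-s)
    return sum(c ** n for n in range(terms))

print([multiplicative_shell(5, n) for n in range(4)])
print(local_integral(5, 2.0), 1 / (1 - 5 ** -2.0))
```

Each shell contributes $c(\pi_v)^{n}$ times measure 1, so the integral over $O_v\setminus\{0\}$ is a geometric series, as in the text.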

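Let us introduce the following notation: for a quasicharacter $c : K_v^\times \to \mathbb{C}^\times$ and $f \in L^{1}(K_v)$


\[ \zeta_v(f, c) = \int_{K_v^\times} f(x)c(x)\,d\mu_v^\times(x) \tag{6.2.37} \]


and suppose that $fc \in L^{1}(K_v^\times)$ for $\mathrm{Re}\,c > 0$. Let $f = \delta_{O_v\setminus\{0\}}$ be the characteristic function of the set $O_v\setminus\{0\}$ and $c$ an unramified quasicharacter. Then for $\mathrm{Re}(\psi_v\omega_s) > 0$ one has the following expression for the local $\zeta$-factor:


\[ \zeta_v(f, \psi_v\omega_s) = \bigl(1 - \chi(\mathfrak{p}_v)\mathrm{N}\mathfrak{p}_v^{-s}\bigr)^{-1}. \]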


In order to obtain a global analogue of this expression let us consider a function $f(x) = \prod_v f_v(x_v)$ on the additive group of adeles $\mathbb{A}_K$ such that $f_v(x) \in L^{1}(K_v)$ and $f_v = \delta_{O_v\setminus\{0\}}$ for non-Archimedean places $v \notin S$. For a quasicharacter $c : J_K/K^\times \to \mathbb{C}^\times$ we shall set


\[ \zeta(f, c) = \int_{J_K} f(x)c(x)\,d\mu^\times(x) = \prod_v \int_{K_v^\times} f_v(x_v)c_v(x_v)\,d\mu_v^\times(x_v). \tag{6.2.38} \]


Then the calculation (6.2.35) implies that


\[ \zeta(f, \psi\omega_s) = L(s,\chi) \prod_{v \in S} \zeta_v(f_v, \psi_v\omega_s), \tag{6.2.39} \]


and one easily verifies that under our assumptions the integral and the product are absolutely convergent for $\mathrm{Re}(\psi\omega_s) > 1$. An analytic continuation for the function $L(s,\chi)$ is constructed using the integral representation (6.2.38) in which all auxiliary factors can be reduced (in our applications) to $\Gamma$-functions and to Gauss sums.

The technique of analytic continuation is based on tools from the theory 
of additive Fourier transforms over the group Ax. The following are the key 
points: 


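I) A choice of duality. Let us fix an additive character
\[ \lambda : \mathbb{A}_K/K \to \mathbb{C}^\times, \qquad \lambda(x) = \prod_v \lambda_v(x_v), \]
where one usually puts:


\[ \lambda_v(x_v) = \begin{cases} \exp(-2\pi i x_v) & \text{if } K_v = \mathbb{R}, \\ \exp(-4\pi i\,\mathrm{Re}\,x_v) & \text{if } K_v = \mathbb{C}, \\ \exp(2\pi i\{\mathrm{Tr}_{K_v/\mathbb{Q}_p} x_v\}) & \text{if } [K_v : \mathbb{Q}_p] < \infty \end{cases} \tag{6.2.40} \]


(the sign at the finite places is opposite to the sign at the Archimedean places, so that $\lambda$ is trivial on $K$). Then the following isomorphisms of locally compact groups are defined:
\[ K_v \cong \hat K_v, \qquad \mathbb{A}_K \cong \hat{\mathbb{A}}_K, \]


(where $\hat K_v$, $\hat{\mathbb{A}}_K$ denote the corresponding groups of (continuous) characters). These isomorphisms are constructed as follows:


\[ x \mapsto (\chi_x : y \mapsto \lambda(xy)) \quad (x, y \in \mathbb{A}_K), \]


\[ x_v \mapsto (\chi_{x_v} : y \mapsto \lambda_v(y x_v)) \quad (x_v, y \in K_v). \]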


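II) Self-dual measures. One chooses normalized measures $\tilde\mu_v$ and $\tilde\mu = \prod_v \tilde\mu_v$ such that for the Fourier transforms


\[ \hat f(x) = \int_{\mathbb{A}_K} f(y)\lambda(xy)\,d\tilde\mu(y), \qquad \hat f_v(x_v) = \int_{K_v} f_v(y)\lambda_v(x_v y)\,d\tilde\mu_v(y) \tag{6.2.41} \]
the inversion formulae can be written in the following form:
\[ \hat{\hat f}(x) = f(-x), \qquad \hat{\hat f}_v(x_v) = f_v(-x_v), \tag{6.2.42} \]


provided $f_v \in L^{1}(K_v)$, $f \in L^{1}(\mathbb{A}_K)$.
If $v$ is a non-Archimedean place then let $\delta_v \subset O_v$ denote the local different,


\[ \delta_v^{-1} = \{x \in K_v \mid \lambda_v(xy) = 1 \text{ for all } y \in O_v\}. \tag{6.2.43} \]


Then the self-dual measure is defined as follows (with $\mu_v$ normalized by $\mu_v(O_v) = 1$ as above):
\[ \tilde\mu_v = \begin{cases} \mathrm{N}\delta_v^{-1/2}\,\mu_v & \text{if } v \text{ is non-Archimedean}, \\ dx \ \text{(Lebesgue measure)} & \text{if } K_v = \mathbb{R}, \\ 2\,dx\,dy & \text{if } z = x + iy \in K_v = \mathbb{C}. \end{cases} \tag{6.2.44} \]


The important property of the self-dual measure is that $\tilde\mu(\mathbb{A}_K/K) = 1$; also one has $\delta_v = \delta_K O_v$ and $\mathrm{N}\delta_K = |D_K|$.

In concrete examples the following orthogonality relation is often used: for each character $\chi$ of a compact group $G$


\[ \int_G \chi(x)\,d\mu(x) = \begin{cases} \mu(G) & \text{if } \chi \equiv 1, \\ 0 & \text{otherwise}. \end{cases} \tag{6.2.45} \]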


This implies the following important formula:
\[ \int_{K_v} \delta_{O_v}(y)\,\lambda_v(xy)\,d\tilde\mu_v(y) = \mathrm{N}\delta_v^{-1/2}\,\delta_{\delta_v^{-1}}(x). \]


III) The Poisson summation formula. Let $f$ be a continuous function on $\mathbb{A}_K$ such that both $|f|$ and $|\hat f|$ are summable over the subset $K \subset \mathbb{A}_K$ and the series $\sum_{\alpha \in K} f(x + \alpha)$ converges uniformly on every compact subset of $\mathbb{A}_K$. Then the following summation formula holds:


\[ \sum_{\alpha \in K} f(\alpha) = \sum_{\alpha \in K} \hat f(\alpha). \tag{6.2.46} \]
Corollary. Under the above assumptions for all $a \in J_K$ the following holds


\[ \sum_{\alpha \in K} f(a\alpha) = \|a\|^{-1} \sum_{\alpha \in K} \hat f(a^{-1}\alpha). \tag{6.2.47} \]
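The classical prototype of the corollary is the transformation law of the theta series. The sketch below works under the assumption $K = \mathbb{Q}$, with $f_\infty(x) = \exp(-\pi x^{2})$ (which is its own Fourier transform for Lebesgue measure) and $f_p$ the characteristic function of $\mathbb{Z}_p$, so that the sum over $\alpha \in \mathbb{Q}$ reduces to a sum over $\mathbb{Z}$. For an idele with Archimedean component $t > 0$ and trivial finite components, the identity becomes $\sum_{n} e^{-\pi t^{2} n^{2}} = t^{-1}\sum_{n} e^{-\pi n^{2}/t^{2}}$, which can be checked numerically.

```python
import math

def theta_sum(t, terms=50):
    """Symmetric truncation of the sum over integers n of exp(-pi * (t*n)^2)."""
    return sum(math.exp(-math.pi * (t * n) ** 2) for n in range(-terms, terms + 1))

for t in (0.5, 1.0, 1.7, 3.0):
    lhs = theta_sum(t)            # sum of f(a * alpha)
    rhs = theta_sum(1.0 / t) / t  # ||a||^-1 * sum of f_hat(a^-1 * alpha)
    print(t, lhs, rhs)
```

The truncation at 50 terms is harmless for the values of $t$ used here, because the omitted terms are far below double precision.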


Now we turn to the main application of the summation formula: the proof of the functional equation for the $\zeta$-functions. We assume that for all $\sigma > 1$ the function $|f(x)|\,\|x\|^{\sigma}$ is integrable over the group of ideles $J_K$. Then for $\mathrm{Re}(\psi\omega_s) > 1$ the following integral is well defined:


\[ \zeta(f, \psi\omega_s) = \int_{J_K} f(x)\,\psi(x)\omega_s(x)\,d\mu^\times(x). \tag{6.2.48} \]


Theorem 6.11. The function $\zeta(f, \psi\omega_s)$ admits an analytic continuation onto the entire complex plane and it satisfies the functional equation:


\[ \zeta(f, \psi\omega_s) = \zeta(\hat f, \psi^{-1}\omega_{1-s}). \tag{6.2.49} \]


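To prove the theorem we decompose the integral into two parts:


\[ \zeta(f, c) = \int_{\|x\| \ge 1} f(x)c(x)\,d\mu^\times(x) + \int_{\|x\| \le 1} f(x)c(x)\,d\mu^\times(x) \quad (\text{with } c = \psi\omega_s). \]


In view of the assumption on $f$ the first integral converges for all $c$. Let us transform the second integral using the Poisson summation formula. We have


\[ \int_{\|x\| \le 1} f(x)c(x)\,d\mu^\times(x) = \int_{\substack{t \in J_K/J_K^{1} \\ |t| \le 1}} \left( \int_{J_K^{1}} f(tx)c(tx)\,d\mu^{1}(x) \right) d\nu(t), \]


where $d\mu^{1}(x)$ is the measure on $J_K^{1}$ which is compatible with the Haar measure $d\nu(t) = |dt/t|$ on $\mathbb{R}_+^\times$ and the original measure $d\mu^\times$ under the isomorphism $J_K/J_K^{1} \cong \mathbb{R}_+^\times$. The inner integral on the right hand side transforms into the following sum


\[ \int_{J_K^{1}} f(tx)c(tx)\,d\mu^{1}(x) = \int_{C_K^{1}} \left( \sum_{\alpha \in K^\times} f(tx\alpha) \right) c(tx)\,d\mu^{1}(x), \qquad C_K^{1} = J_K^{1}/K^\times, \]


where we denote by the same symbol $d\mu^{1}(x)$ the measure on $C_K^{1}$ induced by $d\mu^{1}(x)$ on $J_K^{1}$. Now the inner sum transforms using (6.2.47) as follows:


\[ c(tx) \sum_{\alpha \in K^\times} f(tx\alpha) = c(tx) \left( \sum_{\alpha \in K} f(tx\alpha) - f(0) \right) \]
\[ = c(tx) \left( \|tx\|^{-1} \sum_{\alpha \in K} \hat f(t^{-1}x^{-1}\alpha) - f(0) \right) \]
\[ = c(tx) \left( \|tx\|^{-1} \sum_{\alpha \in K^\times} \hat f(t^{-1}x^{-1}\alpha) + \|tx\|^{-1}\hat f(0) - f(0) \right). \]


We now change variables by putting $u = t^{-1}$, $y = x^{-1}$; the measure in $J_K^{1}$ does not change. Putting the resulting expression into the integral, and using the notation $\hat c = \omega_1 c^{-1} = \psi^{-1}\omega_{1-s}$, the integral over $\|x\| \le 1$ becomes the following


\[ \int_{\|x\| \le 1} f(x)c(x)\,d\mu^\times(x) = \int_{\|x\| \ge 1} \hat f(x)\hat c(x)\,d\mu^\times(x) + \int_{|t| \le 1} \int_{C_K^{1}} c(tx) \left( \|tx\|^{-1}\hat f(0) - f(0) \right) d\mu^{1}(x)\,d\nu(t). \]
The resulting expression for $\zeta(f, c)$ is defined for all $s$ (apart from the possible poles of the last term when $\psi$ is trivial on $J_K^{1}$) and is unchanged when $(f, c)$ is replaced by $(\hat f, \hat c)$. This proves the theorem.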


As an example of this general calculation let us deduce the classical functional equation for the Dedekind zeta function of a number field $K$. Set $f_v = \delta_{O_v}$ for non-Archimedean $v$,


\[ f_v(x) = \begin{cases} \exp(-\pi x^{2}) & \text{for } K_v \cong \mathbb{R}, \\ \exp(-2\pi |z|^{2}) & \text{for } K_v \cong \mathbb{C}. \end{cases} \]


Then one has

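\[ \zeta_v(f_v, \omega_s) = \zeta_v(\hat f_v, \omega_s) = \Gamma_{K_v}(s) \quad (K_v = \mathbb{R}, \mathbb{C}) \]
and we obtain the following expressions for the global integrals ($\zeta$-functions):
\[ \zeta(f, \omega_s) = \Gamma_{\mathbb{R}}^{r_1}(s)\,\Gamma_{\mathbb{C}}^{r_2}(s)\,\zeta_K(s), \]
\[ \zeta(\hat f, \omega_{1-s}) = |D_K|^{(1/2)-s}\,\Gamma_{\mathbb{R}}^{r_1}(1-s)\,\Gamma_{\mathbb{C}}^{r_2}(1-s)\,\zeta_K(1-s), \]
which implies in view of (6.2.49) the following functional equation:
\[ \Lambda_K(s) = \Lambda_K(1-s), \tag{6.2.50} \]


where
\[ \Lambda_K(s) = |D_K|^{s/2}\,\Gamma_{\mathbb{R}}^{r_1}(s)\,\Gamma_{\mathbb{C}}^{r_2}(s)\,\zeta_K(s). \]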


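In the general case of arbitrary quasicharacters we may and we shall assume that $\sum_{v|\infty} t_v = 0$ (by replacing $s$ if necessary). Put


\[ L_v(s,\chi) = \begin{cases} \Gamma_{\mathbb{R}}(s + it_v + |a_v|) & \text{for } K_v = \mathbb{R}, \\ \Gamma_{\mathbb{C}}(s + it_v + |a_v|/2) & \text{for } K_v = \mathbb{C}, \end{cases} \]


\[ D_\chi = |D_K|\,\mathrm{N}\mathfrak{f}(\chi). \]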
Let gu(x) = 2.(xvAv) (em 7”) denote the Gauss sum, where {e} runs over a 


system of coset representatives for $O_v^\times/(1 + \mathfrak{f}(\chi)O_v)$ with $v(\mathfrak{f}(\chi)) > 0$.
Then the following functional equation holds:


\[ W(\chi)\,\Lambda(s,\chi) = \Lambda(1-s, \bar\chi), \tag{6.2.51} \]
in which
\[ \Lambda(s,\chi) = D_\chi^{s/2} \prod_{v|\infty} L_v(s,\chi) \cdot L(s,\chi), \tag{6.2.52} \]


and $W(\chi)$ is a complex constant with absolute value 1 given by
W(x) =i” NF(X)? TT ao(x) [TT x60) 
vESy vESx 


where $S_\chi = \{v \mid \mathfrak{p}_v \text{ divides } \mathfrak{f}(\chi)\}$ and $M = \sum_{v|\infty} |a_v|$.


6.2.5 Explicit Formulae 


(cf. [La70], [Wei52b], [Wei72], [More77]). We already mentioned in chapter 1 a link between the zeroes of the Riemann zeta function and the behaviour of the function $\pi(x) = \sum_{p \le x} 1$, $p$ being prime numbers. This link is expressed by an explicit formula for the function $\sum_{n=1}^{\infty} \frac{1}{n}\pi(x^{1/n})$ in terms of the non-trivial zeroes of $\zeta(s)$, i.e. those zeroes in the critical strip $0 < \mathrm{Re}\,s < 1$ (the Riemann–Mangoldt formula), see Part I, §1.1.6.

A generalization of this formula for Hecke $L$-series $\Lambda(s,\chi)$ (see definition (6.2.52)) was proposed by Weil, and is based on the Weierstrass product expansion of this function over its zeros.

Let us assume that $\mathrm{Re}\,\chi = 0$ and the normalizing condition $\sum_{v|\infty} t_v = 0$ is satisfied (see (6.2.51)). Put $\delta_\chi = 1$ if $\chi = 1$ and $0$ otherwise. Then it follows from the functional equation (6.2.49) that the function


\[ [s(s-1)]^{\delta_\chi}\,\Lambda(s,\chi) \]


is an entire function of order 1. Hence by a general theorem from the theory of functions of one complex variable one has the following Weierstrass product expansion of this function over its zeros:


\[ \Lambda(s,\chi) = a_\chi e^{b_\chi s}\,[s(s-1)]^{-\delta_\chi} \prod_{\omega} \left(1 - \frac{s}{\omega}\right) e^{s/\omega}, \tag{6.2.53} \]


where $\omega = \alpha + i\gamma$ runs over the set of all zeroes of the function $\Lambda(s,\chi)$ counting multiplicities, and $a_\chi$ and $b_\chi$ are certain constants. The main result on explicit formulae for the $L$-functions $\Lambda(s,\chi)$ can then be stated as an equation relating a linear combination of the values of a certain function over (logarithms of norms of) powers of prime ideals, to the sum of the Mellin transform of this function over the zeroes of the $L$-function (6.2.53).

Let us consider a complex valued function $F : \mathbb{R} \to \mathbb{C}$ with the property that there exists a constant $a > 0$ such that


\[ F(x)\,e^{((1/2)+a)|x|} \in L^{1}(\mathbb{R}). \]
Then the Mellin transform
\[ \Phi(s) = \int_{-\infty}^{+\infty} F(x)\,e^{(s-1/2)x}\,dx \]


is a function which is holomorphic in the strip $-a < \mathrm{Re}\,s < 1+a$. We assume that the function $F(x)$ satisfies the following conditions:


A) The function $F(x)$ has a continuous derivative everywhere apart from a finite number of points $a_i$, at which both $F(x)$ and $F'(x)$ have only discontinuities of the first kind, and $F(a_i) = \tfrac{1}{2}[F(a_i + 0) + F(a_i - 0)]$.

B) For some number $b > 0$ the following estimates hold as $|x| \to \infty$:


\[ F(x) = O\bigl(e^{-((1/2)+b)|x|}\bigr), \]


\[ F'(x) = O\bigl(e^{-((1/2)+b)|x|}\bigr). \]
Then we have that $\Phi(s) = O(|t|^{-1})$ uniformly in the strip $-a' \le \sigma \le 1+a'$ if $0 < a' < b$ ($\sigma = \mathrm{Re}\,s$, $t = \mathrm{Im}\,s$).
The explicit formula. With the above notation and assumptions consider the sum
\[ \sum_{|t|<T} \Phi(\omega) \]


extended over all zeroes $\omega = \beta + it$ of the function $L(s,\chi)$ satisfying $0 \le \beta \le 1$, $|t| < T$. Then as $T \to \infty$ this tends to a limit, which is equal to


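\[ \lim_{T \to \infty} \sum_{|t|<T} \Phi(\omega) = \delta_\chi \int_{-\infty}^{+\infty} F(x)\,(e^{x/2} + e^{-x/2})\,dx + F(0)\log A_\chi - \sum_{\mathfrak{p},\,n} \frac{\log \mathrm{N}\mathfrak{p}}{\mathrm{N}\mathfrak{p}^{n/2}} \bigl[\chi(\mathfrak{p})^{n} F(\log \mathrm{N}\mathfrak{p}^{n}) + \chi(\mathfrak{p})^{-n} F(\log \mathrm{N}\mathfrak{p}^{-n})\bigr] - \sum_{v|\infty} W_v(F_v), \tag{6.2.54} \]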


where Ay = 27"!(27)~"? D.Nf(x), 
R@=H=Fa@e’!™ (n,=1K,2R); 


and W, is the functional uniquely defined by the property 


+00 
W.(g) = lim If (1 — el!) (x) g(a) dx — 2g(0) log AJ . 


Here the function K,(x) is defined by 


e- (1/2) ac 
Jex/2 — e-2/2 for K, =R (n, = 2), 
Ky (a2) = 

o-((1/2)-lavel) 


lero] 


for Ky = C (ny = 1). 


The explicit formula (6.2.54) is a rich source of possibilities for studying 
very fine points in the distribution of prime ideals in number fields and the 
images of the corresponding Frobenius elements in Abelian Galois represen- 
tations. We mention that there is an analogue of these formulae in the case 
of global fields of positive characteristic (the function field case). This gen- 
eralization makes it possible to obtain some very precise estimates for the 
numbers of points on curves over finite fields, and has some other interesting 
applications (cf. [Se83], [Step99], [TsV191]). 

The logical structure of the proof of (6.2.54) is quite simple and is based on studying the integral of the function $\Phi(s)\,d\log\Lambda(s,\chi)$ along a special contour using two explicit expressions for the logarithmic derivative $\Lambda'(s,\chi)/\Lambda(s,\chi)$ arising from the Euler product defining $L(s,\chi)$ and from the Weierstrass product formula (6.2.53). One uses only the following results of an arithmetical nature:


a) The boundedness of $|L(s,\chi)|$ in each right half plane of type $\mathrm{Re}(s) \ge 1+a$, $a > 0$.

b) The functional equation (6.2.51) for $\Lambda(s,\chi)$.

c) The boundedness of the function $\Lambda(s,\chi)$ in every strip $\sigma_0 \le \mathrm{Re}(s) \le \sigma_1$ excluding a finite number of poles.


6.2.6 The Weil Group and its Representations 


(cf. [Ta79], [Wei74a]). We wish to discuss a general construction which makes it possible to treat at the same time representations of Galois groups and quasicharacters of number fields, and their $L$-functions. This construction is based on the notion of the Weil group, which we briefly discuss now.

Let $F$ be a local or global field and $\bar F$ its separable algebraic closure. For a finite extension $E/F$ in $\bar F$ let $G_E = \mathrm{Gal}(\bar F/E)$ be the corresponding open subgroup. Let


\[ C_E = \begin{cases} J_E/E^\times & \text{if } E \text{ is a global field}, \\ E^\times & \text{if } E \text{ is a local field}. \end{cases} \]


The relative Weil group $W_{E/F}$ can be described as a group extension
\[ 0 \to C_E \to W_{E/F} \to \mathrm{Gal}(E/F) \to 0, \tag{6.2.55} \]


whose isomorphism class is defined by the canonical generator $\alpha_{E/F}$ of the cohomology group $H^{2}(\mathrm{Gal}(E/F), C_E) = \langle \alpha_{E/F} \rangle$ given by class field theory (see §4.4).

There is also a more invariant definition which makes it possible to treat all extensions $E/F$ at the same time (cf. [Ta79]).

The absolute Weil group $W_F$ is defined as a topological group endowed with a continuous homomorphism $\varphi : W_F \to G_F$ with dense image, which satisfies the following additional conditions ([Ta79], p. 74-75):


W1) There exist isomorphisms $r_E : C_E \to W_E^{ab}$, where $W_E = \varphi^{-1}(G_E)$ for all finite extensions $E$ and $W_E^{ab} = W_E/W_E^{c}$, $W_E^{c}$ being the minimal closed subgroup of $W_E$ containing all its commutators. These isomorphisms satisfy the following condition: the composition


\[ C_E \xrightarrow{\ r_E,\ \sim\ } W_E^{ab} \xrightarrow{\ \varphi\ } G_E^{ab} \]


coincides with the homomorphism of class field theory.
W2) Let $w \in W_F$ and $\sigma = \varphi(w) \in G_F$. Then for each $E$ the following diagram commutes:


\[ \begin{array}{ccccc} & C_E & \xrightarrow{\ r_E\ } & W_E^{ab} & \\ \text{isomorphism induced by } \sigma & \downarrow & & \downarrow & \text{conjugation by } w \\ & C_{E^{\sigma}} & \xrightarrow{\ r_{E^{\sigma}}\ } & W_{E^{\sigma}}^{ab} & \end{array} \]


W3) For $E' \subset E$ the following diagram commutes:


\[ \begin{array}{ccccc} & C_{E'} & \xrightarrow{\ r_{E'}\ } & W_{E'}^{ab} & \\ \text{homomorphism induced by the inclusion } E' \subset E & \downarrow & & \downarrow & \text{transfer, see (4.4.18)} \\ & C_E & \xrightarrow{\ r_E\ } & W_E^{ab} & \end{array} \]

W4) The natural map


\[ W_F \to W_{E/F} := W_F/W_E^{c} \]


defines an isomorphism


\[ W_F \cong \varprojlim_{E} W_{E/F}. \]


It is not difficult to verify that this is equivalent to the previous definition.

The group $W_F$ can be constructed starting from the above relative Weil groups $W_{E/F}$ using certain functorial properties of the classes $\alpha_{E/F}$. From the existence of $W_F$ with properties W1) - W4) one can deduce all the main theorems of class field theory (in both local and global cases).

Also, there exists a homomorphism $w \mapsto \|w\|$ of $W_F$ to $\mathbb{R}_+^\times$ which corresponds under the isomorphism $r_F : C_F \to W_F^{ab}$ to the norm homomorphism to $\mathbb{R}_+^\times$ of the idele class group $C_F = J_F/F^\times$ in the global case, and to the normalized absolute value of $C_F = F^\times$ in the local case. In view of the relation $\|\mathrm{N}_{E/F}\,a\|_F = \|a\|_E$ the restriction of this norm function $\|w\|$ on $W_F$ to the subgroup $W_E$ coincides with the corresponding norm function for $W_E$, so that we can omit the index $E$. One verifies that the kernel of the homomorphism $w \mapsto \|w\|$ is compact.


The relation between the local and global Weil groups. Let $F$ be a global field and $v$ a place of $F$ extended to $\bar F$. Then there exists a natural embedding $\theta_v : W_{F_v} \to W_F$ which is compatible with the inclusions $i_v : G_{F_v} \to G_F$ and $E_v^\times \to C_E$ for all $E/F$, $[E : F] < \infty$.


Representations of the Weil groups. Denote by $M(G)$ the set of isomorphism classes of finite dimensional complex representations $\rho : G \to \mathrm{GL}(V)$ of a topological group $G$. A one-dimensional representation $\chi : G \to \mathbb{C}^\times$ will be called a quasicharacter of $G$. Using the isomorphism $r_F : C_F \to W_F^{ab}$ we can identify quasicharacters of $W_F$ with quasicharacters of $F^\times$ (or of $C_F$). For example, the quasicharacter corresponding to the quasicharacter $c \mapsto \|c\|_F^{s}$ (with $\|c\|_F$ being the idele norm of $c \in C_F$) will be denoted by the same symbol $\omega_s$, so that one has $\omega_s(w) = \|w\|^{s}$.

On the other hand the image of $\varphi : W_F \to G_F$ is dense, hence the set $M(G_F)$ can be identified with a subset of $M(W_F)$. Representations in this subset are called Galois-type representations. A representation $\rho$ is of Galois type iff the image $\rho(W_F)$ is finite.

Under the above identification a character $\chi$ of $G_F$ corresponds to the character of $C_F$ obtained from $\chi$ using class field theory.

Using the embeddings $\theta_v : W_{F_v} \to W_F$ Weil has defined $L$-functions $L(\rho,s)$ of representations $\rho \in M(W_F)$ which include the $L$-functions of Artin and Hecke as special cases (Hecke–Weil $L$-functions). For these $L$-functions the usual Artin formalism is valid. Also, there is an analogue of the theorem of Brauer, which makes it possible to reduce the Hecke–Weil $L$-functions to products of integral powers of Hecke $L$-functions of quasicharacters of finite extensions $E$ of $F$. A precise statement of the functional equation of $L(\rho, s)$ and a definition of all its local factors are given in [Ta79].

Weil has established explicit formulae of the type (6.2.54), and proposed a generalized Riemann hypothesis, and an analogue of the Artin conjecture for the Hecke–Weil $L$-functions $L(\rho,s)$. A remarkable fact is that both the Riemann and Artin conjectures can be reduced to positivity properties of a certain linear functional in the right hand side of the generalized explicit formula of type (6.2.54).

Apart from complex representations one can also consider $l$-adic representations of $W_F$, and compatible systems of such representations. Tate gives in [Ta79] general conjectures which indicate that complex and $l$-adic representations of the Weil group play a universal role in number theory.


6.2.7 Zeta Functions, L-Functions and Motives 


(cf. [Man68], [Del79]). As we have seen with the example of the Dedekind zeta 
function ¢x(s), the zeta function ¢(X, s) of an arithmetic scheme X can often 
be expressed in terms of Z—functions of certain Galois representations. This 
link seems to be universal in the following sense. 

Let X — Spec Ox be an arithmetic scheme over the maximal order Ox 
of a number field AK such that the generic fiber Xx = X ®o0, K is a smooth 
projective variety of dimension d, and let 


¢(X, 8) = []¢(X(p), 8) 
p 


be its zeta function, where X(p) = X @o0, (Ox/p) is the reduction of X 
modulo a maximal ideal p C Ox. The shape of the function ¢(X(p),s) is 
described by the Weil conjecture (W4). If we assume that all X(p) are smooth 
projective varieties over Ox /p = F, then we obtain the following expressions 
for ¢(X,s): 
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2d =“ 
¢(X,8) = ][ L(x, 8)?" (6.2.56) 
i=0 


where 
Li(X, 8) = II Pip (X, Ne) ty 
Pp 


and P;y(X,t) € Q{t] denote polynomials from the decomposition of the zeta 


function 
2d 


—s\(—1)'+1 
C(X(p),8) = T] Pip X. Np, 
i=0 
In order to prove the conjecture (W4) (“the Riemann Hypothesis over a finite 
field”), Deligne identified the functions L;(X,s) with the Z—functions of certain 
rational /—adic Galois representations 


pPxi: Gr — Aut Hi (Xx, Q:); Li(X,s) = L(px 3,8) 


defined by a natural action of the Galois group Gx on the [—adic cohomology 
groups H}.(Xz,Q,) using the transfer of structure 


XE =Xxak 
Spec K % Spec K (o € Aut R). 


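If $X_K$ is an algebraic curve then there are $G_K$-module isomorphisms

$$H^1_{\mathrm{et}}(X_{\overline{K}}, \mathbb{Q}_l) \cong V_l(J) = T_l(J) \otimes_{\mathbb{Z}_l} \mathbb{Q}_l$$

(the Tate module of the Jacobian $J$ of $X$),

$$H^0_{\mathrm{et}}(X_{\overline{K}}, \mathbb{Q}_l) = \mathbb{Q}_l, \qquad H^2_{\mathrm{et}}(X_{\overline{K}}, \mathbb{Q}_l) \cong V_l(\mu)$$

($V_l(\mu) = T_l(\mu) \otimes_{\mathbb{Z}_l} \mathbb{Q}_l$; the Tate module of $l$-power roots of unity). This implies
the following explicit expressions for the $L$-functions

$$L_0(X,s) = \zeta_K(s), \qquad L_2(X,s) = \zeta_K(s-1),$$

and the zeta function

$$L_1(X,s) = L(X,s) = \prod_{\mathfrak{p}} P_{1,\mathfrak{p}}(X, N\mathfrak{p}^{-s})^{-1}$$

(where $\deg P_{1,\mathfrak{p}}(X,t) = 2g$, $g$ being the genus of the curve $X_K$) is often called the
$L$-function of the curve $X$.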
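
For a curve of genus one the local polynomial $P_{1,\mathfrak{p}}$ can be found by a direct point count. The following is a minimal sketch under stated assumptions: $K = \mathbb{Q}$, the curve is given by a short Weierstrass equation $y^2 = x^3 + ax + b$, and $p$ is a prime of good reduction. The function names are illustrative and do not come from the text. In this case $P_{1,p}(t) = 1 - a_p t + p t^2$ with $a_p = p + 1 - \#E(\mathbb{F}_p)$, and the Weil conjecture (W4) amounts to the bound $a_p^2 \le 4p$.

```python
def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a*x + b over F_p, including the point at infinity."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x * x * x + a * x + b) % p] for x in range(p))
    return affine + 1  # add the point at infinity


def local_factor(a, b, p):
    """Coefficients (1, -a_p, p) of P_{1,p}(t) = 1 - a_p*t + p*t^2 (good reduction assumed)."""
    a_p = p + 1 - count_points(a, b, p)
    return (1, -a_p, p)


def satisfies_weil_bound(a, b, p):
    """Check |a_p| <= 2*sqrt(p) using integers only."""
    a_p = p + 1 - count_points(a, b, p)
    return a_p * a_p <= 4 * p
```

For instance, `count_points(1, 1, 5)` counts the points of $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$, and the zeta function of the reduced curve is then $P_{1,5}(t)/((1-t)(1-5t))$ with $t = 5^{-s}$.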

For topological varieties cohomology classes can be represented using cycles
(by Poincaré duality), or using cells if the variety is a CW-complex.
Grothendieck has conjectured that an analogue of CW-decomposition must
exist for algebraic varieties over $K$. In view of this decomposition the
factorization of the zeta function (6.2.56) should correspond to the decomposition
of the variety into "generalized cells", which are no longer algebraic varieties
but motives, elements of a certain larger category $\mathcal{M}_K$. This category is
constructed in several steps, starting from the category $\mathcal{V}_K$ of smooth projective
varieties over $K$.


Step 1. One constructs first an additive category $\mathcal{M}'_K$, in which the $\operatorname{Hom}(M,N)$
    are vector spaces over $\mathbb{Q}$, and one constructs a contravariant functor $H^*$
    from $\mathcal{V}_K$ to $\mathcal{M}'_K$, which is bijective on objects (i.e. with objects $H^*(X)$,
    one for each $X \in \operatorname{Ob}(\mathcal{V}_K)$). This category is endowed with the following
    additional structures:

    a) a tensor product $\otimes$ satisfying the standard commutativity,
       associativity and distributivity constraints;

    b) the functor $H^*$ takes disjoint unions of varieties into direct sums and
       products into tensor products (by means of a natural transformation
       compatible with the commutativity and associativity).

    In this definition the group $\operatorname{Hom}(H^*(X), H^*(Y))$ is defined as a certain
    group of classes of correspondences between $X$ and $Y$. For a smooth
    projective variety $X$ over $K$ denote by $Z^i(X)$ the vector space over $\mathbb{Q}$ whose
    basis is the set of all irreducible closed subschemes of codimension $i$, and
    denote by $Z^i_R(X)$ its quotient space modulo cohomological equivalence
    of cycles. Then in Grothendieck's definition, for fields $K$ of characteristic
    zero one puts

    $$\operatorname{Hom}(H^*(Y), H^*(X)) = Z^{\dim Y}_R(X \times Y).$$


Step 2. The category $\mathcal{M}_{\mathrm{eff},K}$ of false effective motives. This is obtained from
    $\mathcal{M}'_K$ by formally adjoining the images of all projections (i.e. of idempotent
    morphisms). In this category every projection arises from a direct
    sum decomposition. Categories with a tensor product and with the latter
    property are called Karoubian or pseudo-Abelian categories; $\mathcal{M}_{\mathrm{eff},K}$ is the
    pseudo-Abelian envelope of $\mathcal{M}'_K$, cf. [Del79].

Step 3. The category $\overset{\circ}{\mathcal{M}}_K$ of false motives. Next we adjoin to $\mathcal{M}_{\mathrm{eff},K}$ all
    powers of the Tate object $\mathbb{Q}(1) = \underline{\operatorname{Hom}}(L, \mathbb{Q})$, where $L = \mathbb{Q}(-1) = H^2(\mathbb{P}^1)$
    is the Lefschetz object and $\underline{\operatorname{Hom}}$ denotes the internal Hom in $\mathcal{M}_{\mathrm{eff},K}$. As
    a result we get the category $\overset{\circ}{\mathcal{M}}_K$ of "false motives". The category $\overset{\circ}{\mathcal{M}}_K$
    can be obtained by a universal construction which converts the functor
    $M \mapsto M \otimes \mathbb{Q}(-1) = M(-1)$ into an invertible functor. Each object of $\overset{\circ}{\mathcal{M}}_K$
    has the form $M(n)$ with some $M$ from $\mathcal{M}_{\mathrm{eff},K}$.


    Note that for $X \in \operatorname{Ob}(\mathcal{V}_K)$ the objects $H^i(X)$ are defined as the images
    of appropriate projections and

    $$H^*(X) = \bigoplus_{i=0}^{2d} H^i(X).$$

    The category $\overset{\circ}{\mathcal{M}}_K$ is a $\mathbb{Q}$-linear rigid Abelian category with the
    commutativity rule

    $$\Psi^{r,s} : H^r(X) \otimes H^s(Y) \cong H^s(Y) \otimes H^r(X), \qquad u \otimes v \mapsto (-1)^{rs}\, v \otimes u,$$

    which implies that the rank $\operatorname{rk}(H(X)) = \sum_r (-1)^r \dim H^r(X)$ could be
    negative (in fact it coincides with the Euler characteristic of $X$).

Step 4. The category $\mathcal{M}_K$ of true motives is obtained from $\overset{\circ}{\mathcal{M}}_K$ by a
    modification of the above commutativity constraint, in which the sign $(-1)^{rs}$
    is dropped. This is a $\mathbb{Q}$-linear Tannakian category, formed by direct sums
    of factors of the type $M \subset H^r(X)(m)$, see [Del79].


    Tannakian categories are characterized by the property that every such
    category (endowed with a fiber functor) can be realized as the category
    of finite dimensional representations of some (pro-)algebraic group.

    In particular, the category of motives thus obtained can be regarded as the
    category of finite dimensional representations of a certain (pro-)algebraic
    group (the so-called motivic Galois group).

    Each standard cohomology theory $\mathcal{H}$ on $\mathcal{V}_K$ (a functor from $\mathcal{V}_K$ to an
    Abelian category with the Künneth formula and with some standard
    functoriality properties) can be extended to the category $\mathcal{M}_K$. This extension
    thus defines the $\mathcal{H}$-realizations of motives.

In order to construct $L$-functions of motives one uses the following
realizations:


a) The Betti realization $H_B$: for a field $K$ embedded in $\mathbb{C}$ and $X \in \operatorname{Ob}(\mathcal{V}_K)$
   the singular cohomology groups (vector spaces over $\mathbb{Q}$) are defined

   $$H_B : X \mapsto H^*(X(\mathbb{C}), \mathbb{Q}) = H_B(X).$$

   One has a Hodge decomposition of the complex vector spaces

   $$H_B(M) \otimes \mathbb{C} = \bigoplus_{p,q} H^{p,q}(M) \qquad (h^{p,q} = \dim_{\mathbb{C}} H^{p,q}(M)),$$

   such that $\overline{H^{p,q}(M)} = H^{q,p}(M)$. If $K \subset \mathbb{R}$ then the complex conjugation on
   $X(\mathbb{C})$ defines a canonical involution $F_\infty$ on $H_B(M)$, which may be viewed
   as the Frobenius element at infinity.

b) The $l$-adic realizations $H_l$: if $\operatorname{Char} K \neq l$, $X \in \operatorname{Ob}(\mathcal{V}_K)$ then the $l$-adic
   cohomology groups are defined as certain vector spaces over $\mathbb{Q}_l$

   $$H_l : X \mapsto H^*_{\mathrm{et}}(X_{\overline{K}}, \mathbb{Q}_l) = H_l(X).$$

   There is a natural action of the Galois group $G_K$ on $H_l(X)$ by way of
   which one assigns an $l$-adic representation to a motive $M \in \mathcal{M}_K$

   $$\rho_{M,l} : G_K \to \operatorname{Aut} H_l(M).$$

   A non-trivial fact is that these representations are rational for some
   $E$, $E \subset \mathbb{C}$, in the sense of §6.2.1.


Using the general construction of §6.2.1 one defines the $L$-functions

$$L(M,s) = \prod_v L_v(M,s) \qquad (v \text{ finite}),$$

where $L_v(M,s)^{-1} = L_{\mathfrak{p}_v}(M, N\mathfrak{p}_v^{-s})^{-1}$ are certain polynomials in the variable
$t = N\mathfrak{p}_v^{-s}$ with coefficients in $E$.

For Archimedean places $v$ one chooses a complex embedding $\tau_v : K \to \mathbb{C}$
defining $v$. Then the factors $L_v(M,s)$ are constructed using the Hodge
decomposition $H_B(M) \otimes \mathbb{C} = \bigoplus H^{p,q}(M)$ and the action of the involution $F_\infty$
(see the table in 5.3 of [Del79]).

According to a general conjecture the product

$$\Lambda(M,s) = \prod_v L_v(M,s) \qquad (v \in \Sigma_K)$$

admits an analytic (meromorphic) continuation to the entire complex plane
and satisfies a certain (conjectural) functional equation of the form

$$\Lambda(M,s) = \varepsilon(M,s)\,\Lambda(M^\vee, 1-s),$$

where $M^\vee$ is the motive dual to $M$ (its realizations are duals of those of $M$),
and $\varepsilon(M,s)$ is a certain function of $s$ which is a product of an exponential
function and a constant.

One has the following equation

$$\Lambda(M(n),s) = \Lambda(M,s+n).$$


A motive $M$ is called pure of weight $w$ if $h^{p,q} = 0$ for $p+q \neq w$. In
this case we put $\operatorname{Re}(M) = -\frac{w}{2}$. The Weil conjecture (W4) (see section 6.1.3)
implies that for a sufficiently large finite set $S$ of places of $K$ the corresponding
Dirichlet series (and the Euler product)

$$L_S(M,s) = \prod_{v \notin S} L_v(M,s)$$

converges absolutely for $\operatorname{Re}(M) + \operatorname{Re}(s) > 1$.

For points $s$ on the boundary of absolute convergence (i.e. for
$\operatorname{Re}(M) + \operatorname{Re}(s) = 1$) there is the following general conjecture (generalizing the theorem
of Hadamard and de la Vallée-Poussin):

a) the function $L_S(M,s)$ does not vanish for $\operatorname{Re}(M) + \operatorname{Re}(s) = 1$;

b) the function $L_S(M,s)$ is entire apart from the case when $M$ has even
   weight $-2n$ and contains as a summand the motive $\mathbb{Q}(n)$; in the last case
   there is a pole at $s = 1-n$.

For example, for the motive $\mathbb{Q}(-1)$ one has

$$H_B(\mathbb{Q}(-1)) = H^2(\mathbb{P}^1(\mathbb{C}), \mathbb{Q}), \qquad H_l(\mathbb{Q}(-1)) = V_l(\mu) = T_l(\mu) \otimes_{\mathbb{Z}_l} \mathbb{Q}_l,$$

$w = 2$, $n = -1$ and the $L$-function

$$L(\mathbb{Q}(-1), s) = \zeta_K(s-1)$$

has a simple pole at $s = 2$.

There are some very general conjectures on the existence of a correspondence
between motives and compatible systems of $l$-adic representations.
Nowadays these conjectures essentially determine key directions in arithmetical
research ([CR01], [Tay02], [BoCa79], [Bor79], [Ta79]). We mention only the
remarkable fact that, in view of the proof of the theorem of G. Faltings (see
§5.5), an Abelian variety is uniquely determined up to isogeny by the
corresponding $l$-adic Galois representation on its Tate module.

This important result is also crucial in Wiles' marvelous proof: in order
to show that every semistable elliptic curve $E$ over $\mathbb{Q}$ admits a modular
parametrisation (see §7.2), it is enough (due to Faltings) to check that for some
prime $p$ the $L$-function of the Galois representation $\rho_{p,E}$ coincides with the
Mellin transform of a modular form of weight two (Wiles has used $p = 3$
and $p = 5$). In other words, the generating series of such a representation,
defined starting from the traces of Frobenius elements, is a modular form of
weight two, which is proved by counting all possible deformations of the Galois
representation in question taken modulo $p$.


6.3.1 A Link Between Algebraic Varieties and L-Functions


There is one more method of constructing $L$-functions, which is connected
with modular forms (or more generally with automorphic forms). These forms
may be regarded as certain special functions on real reductive groups $G(\mathbb{R})$
(or on the symmetric spaces associated with them). These functions, which
at first sight seem to be analytic rather than number theoretical objects, turn
out to be closely related to a) Diophantine equations (arithmetic schemes, see
§6.1), and b) Galois representations. A link between the three types of object
is given by identifying the corresponding $L$-functions. A non-trivial example
of the link between a) and b) is given by the proof of the theorem of Faltings:
the $L$-function $L_1(A,s)$ attached to $H^1(A)$ of an Abelian variety $A$ uniquely
determines $A$ up to isogeny. It is even sufficient to know a finite number of the
local factors (see §5.5)

$$L_v(A,s) = \det\left(1 - N\mathfrak{p}_v^{-s}\,\mathrm{Fr}_v \mid T_l(A)\right)^{-1} \qquad (l \nmid \operatorname{Char} v).$$

Therefore the finiteness problems for Abelian varieties up to isogeny can be
restated in terms of the corresponding Galois representations.

A characteristic feature of the modern theory of $L$-functions is the study
of automorphic forms together with the (infinite dimensional) representations
of the groups $G(\mathbb{R})$ and $G(\mathbb{A})$ generated by these forms, where $\mathbb{A}$ denotes
the adele ring of $\mathbb{Q}$ and it is assumed that $G$ is a reductive algebraic group
defined over $\mathbb{Q}$. These representations (automorphic representations) occur in
the corresponding regular representations, i.e. in vector spaces of smooth (or
square integrable) functions over these groups (with respect to Haar measure).
This approach makes it possible to study the $L$-functions using methods from
the representation theory of the groups $G(\mathbb{Q}_p)$, $G(\mathbb{R})$ and $G(\mathbb{A})$ (cf. [Bor79]
and §6.5).


6.3.2 Classical modular forms 


are introduced as certain holomorphic functions on the upper half plane
$\mathbb{H} = \{z \in \mathbb{C} \mid \operatorname{Im} z > 0\}$, which can be regarded as a homogeneous space for the
group $G(\mathbb{R}) = \mathrm{GL}_2(\mathbb{R})$:

$$\mathbb{H} = \mathrm{GL}_2(\mathbb{R})/\mathrm{O}(2) \cdot Z, \tag{6.3.1}$$

where $Z = \left\{ \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix} \mid x \in \mathbb{R}^\times \right\}$ is the center of $G(\mathbb{R})$ and $\mathrm{O}(2)$ is the orthogonal
group. The group $\mathrm{GL}_2^+(\mathbb{R})$ of matrices $\gamma = \begin{pmatrix} a_\gamma & b_\gamma \\ c_\gamma & d_\gamma \end{pmatrix}$ with positive determinant
acts on $\mathbb{H}$ by fractional linear transformations; on cosets (6.3.1) this action
transforms into the natural action by group shifts.

Let $\Gamma$ be a subgroup of finite index in the modular group $\mathrm{SL}_2(\mathbb{Z})$. A
holomorphic function $f : \mathbb{H} \to \mathbb{C}$ is called a modular form of (integral) weight
$k$ with respect to $\Gamma$ iff the conditions a) and b) are satisfied:


a) Automorphy condition:

   $$f\left(\frac{a_\gamma z + b_\gamma}{c_\gamma z + d_\gamma}\right) = (c_\gamma z + d_\gamma)^k f(z) \tag{6.3.2}$$

   for all elements $\gamma \in \Gamma$;

b) Regularity at cusps: $f$ is regular at cusps $z \in \mathbb{Q} \cup i\infty$ (the cusps can
   be viewed as fixed points of parabolic elements of $\Gamma$); this means that
   for each element $\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ the function $(cz+d)^{-k} f\left(\frac{az+b}{cz+d}\right)$
   admits a Fourier expansion over non-negative powers of $q^{1/N} = e(z/N)$
   for a natural number $N$. One writes traditionally

   $$q = e(z) = \exp(2\pi i z).$$

A modular form

$$f(z) = \sum_{n=0}^{\infty} a(n)\, e(nz/N)$$

is called a cusp form if $f$ vanishes at all cusps (i.e. if the above Fourier
expansion contains only positive powers of $q^{1/N}$), see [La76], [Mi89], [Ogg65],
[Fom77], [Pan81].


The complex vector space of all modular (resp. cusp) forms of weight $k$
with respect to $\Gamma$ is denoted by $\mathcal{M}_k(\Gamma)$ (resp. $\mathcal{S}_k(\Gamma)$).

A basic fact from the theory of modular forms is that the spaces of modular
forms are finite dimensional. Also, one has $\mathcal{M}_k(\Gamma)\mathcal{M}_l(\Gamma) \subset \mathcal{M}_{k+l}(\Gamma)$. The
direct sum

$$\mathcal{M}(\Gamma) = \bigoplus_{k=0}^{\infty} \mathcal{M}_k(\Gamma)$$

turns out to be a graded algebra over $\mathbb{C}$ with a finite number of generators.
An example of a modular form with respect to $\mathrm{SL}_2(\mathbb{Z})$ of weight $k \geq 4$ is
given by the Eisenstein series

$$G_k(z) = \sum_{m_1, m_2 \in \mathbb{Z}}{}^{\prime}\ (m_1 + m_2 z)^{-k} \tag{6.3.3}$$

(the prime denoting $(m_1, m_2) \neq (0,0)$). For these series the automorphy condition
(6.3.2) can be deduced straight from the definition. One has $G_k(z) = 0$ for
odd $k$, and for even $k$

$$G_k(z) = \frac{2(2\pi i)^k}{(k-1)!} \left[ -\frac{B_k}{2k} + \sum_{n=1}^{\infty} \sigma_{k-1}(n)\, e(nz) \right], \tag{6.3.4}$$

where $\sigma_{k-1}(n) = \sum_{d \mid n} d^{k-1}$ and $B_k$ is the $k$-th Bernoulli number.

Fig. 6.1. The group SL(2, Z).
Graphics in Figure 6.1 is contributed by Curtis T. McMullen. It represents the action
of the group SL(2, Z) on $\mathbb{H}$.

The graded algebra $\mathcal{M}(\mathrm{SL}_2(\mathbb{Z}))$ is isomorphic to the polynomial ring of the
(independent) variables $G_4$ and $G_6$.
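
One concrete consequence of this structure statement can be tested on Fourier coefficients. Since the algebra is a polynomial ring in $G_4$ and $G_6$, the spaces of weight 8 and weight 10 are spanned by $G_4^2$ and $G_4 G_6$ respectively. The sketch below assumes the normalization $E_k = 1 - \frac{2k}{B_k}\sum_{n \ge 1} \sigma_{k-1}(n) q^n$ (that is, (6.3.4) rescaled to have constant term 1), which gives the constants 240, $-504$, 480 and $-264$ for $k = 4, 6, 8, 10$. The helper names are illustrative.

```python
def sigma(power, n):
    """Divisor power sum: sum of d**power over the divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def eisenstein(constant, power, terms):
    """First `terms` q-expansion coefficients of 1 + constant * sum sigma_power(n) q^n."""
    return [1] + [constant * sigma(power, n) for n in range(1, terms)]


def multiply(f, g):
    """Product of two truncated power series given as coefficient lists."""
    terms = min(len(f), len(g))
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(terms)]


TERMS = 20
E4 = eisenstein(240, 3, TERMS)
E6 = eisenstein(-504, 5, TERMS)
E8 = eisenstein(480, 7, TERMS)
E10 = eisenstein(-264, 9, TERMS)

# Weight 8 and weight 10 are one-dimensional, so these products must agree
# coefficient by coefficient with the normalized Eisenstein series.
weight8_matches = multiply(E4, E4) == E8
weight10_matches = multiply(E4, E6) == E10
```

Comparing coefficients of $q^n$ in $E_4^2 = E_8$ yields an identity between $\sigma_7$ and a convolution of $\sigma_3$ with itself, an example of the arithmetical content hidden in the finite dimensionality of the spaces of modular forms.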

The set $\mathbb{H}/\mathrm{SL}_2(\mathbb{Z})$ can be identified with the set of isomorphism classes
of elliptic curves over $\mathbb{C}$: to each $z \in \mathbb{H}$ one associates the complex torus
$\mathbb{C}/(\mathbb{Z} + z\mathbb{Z})$ which is analytically isomorphic to the Riemann surface of the
elliptic curve written in Weierstrass form as follows

$$y^2 = 4x^3 - g_2(z)x - g_3(z), \tag{6.3.5}$$

where $g_2(z) = 60\,G_4(z)$, $g_3(z) = 140\,G_6(z)$.
If we replace $z$ by $\gamma(z)$ for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ then the lattice
$\Lambda_z = \mathbb{Z} + z\mathbb{Z}$ is replaced by

$$\Lambda_{\gamma(z)} = \mathbb{Z} + \gamma(z)\mathbb{Z} = (cz+d)^{-1}(\mathbb{Z} + z\mathbb{Z}) = (cz+d)^{-1}\Lambda_z,$$

and the curve (6.3.5) is then replaced by the curve whose Weierstrass form
has the coefficients

$$g_2(\gamma(z)) = (cz+d)^4 g_2(z), \qquad g_3(\gamma(z)) = (cz+d)^6 g_3(z).$$


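The discriminant of the cubic polynomial in the right hand side of (6.3.5)
is a cusp form of weight 12 with respect to $\Gamma = \mathrm{SL}_2(\mathbb{Z})$:

$$2^{-4}(g_2^3 - 27 g_3^2) = 2^{-4}(2\pi)^{12}\, e(z) \prod_{m=1}^{\infty} (1 - e(mz))^{24} = 2^{-4}(2\pi)^{12} \sum_{n=1}^{\infty} \tau(n)\, e(nz), \tag{6.3.6}$$

where $\tau(n)$ is the Ramanujan function. The function

$$j(z) = 1728\, \frac{g_2^3}{g_2^3 - 27 g_3^2} = q^{-1} + 744 + \sum_{n=1}^{\infty} c(n)\, e(nz) \tag{6.3.7}$$

is meromorphic on $\mathbb{H}$ and at $\infty$, and is invariant under $\Gamma = \mathrm{SL}_2(\mathbb{Z})$. This
function provides an important example of a modular function; it is called the
modular invariant.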
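
The first coefficients $c(n)$ of the modular invariant can be recovered with integer arithmetic only. The sketch below rests on an assumption not spelled out in the text: after cancelling the powers of $\pi$, the quotient $1728\, g_2^3/(g_2^3 - 27 g_3^2)$ equals $E_4^3/\Delta$ with $E_4 = 1 + 240\sum \sigma_3(n) q^n$ and $\Delta = q \prod (1-q^m)^{24}$. The function names are illustrative.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(f, g):
    """Product of two truncated power series given as coefficient lists."""
    terms = min(len(f), len(g))
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(terms)]


def j_coefficients(terms):
    """Coefficients of q^-1, q^0, q^1, ... in the expansion of j = E4^3 / Delta."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, terms)]
    e4_cubed = multiply(multiply(e4, e4), e4)

    # eta24 = prod_{m>=1} (1 - q^m)^24, so that Delta = q * eta24
    eta24 = [1] + [0] * (terms - 1)
    for m in range(1, terms):
        for _ in range(24):
            # in-place multiplication by (1 - q^m); descending n uses old values
            for n in range(terms - 1, m - 1, -1):
                eta24[n] -= eta24[n - m]

    # power series division; exact over the integers because eta24[0] == 1
    quotient = []
    for n in range(terms):
        quotient.append(e4_cubed[n] - sum(quotient[i] * eta24[n - i] for i in range(n)))
    return quotient
```

The list returned by `j_coefficients(5)` starts with the coefficient of $q^{-1}$, followed by the constant term 744 of (6.3.7) and then $c(1), c(2), \dots$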


6.3.3 Application: Tate Curve and Semistable Elliptic Curves 


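(see [Ta74], [He97], p.343, [Se72], p.276). Let us use the modified Weierstrass
equation of §5.3.3 in the following form (with coefficients in $\mathbb{Z}[[q]]$):

$$\mathrm{Tate}(q) : y^2 + xy = x^3 + B(q)x + C(q), \tag{6.3.8}$$

where

$$B(q) = -5\left(\frac{E_4 - 1}{240}\right) = -5\sum_{n=1}^{\infty} \frac{n^3 q^n}{1-q^n}, \qquad
C(q) = \frac{-5\left(\frac{E_4-1}{240}\right) - 7\left(\frac{E_6-1}{-504}\right)}{12} = -\frac{1}{12}\sum_{n=1}^{\infty} \frac{(7n^5 + 5n^3)\,q^n}{1-q^n}. \tag{6.3.9}$$

This equation defines an elliptic curve over the ring $\mathbb{Z}((q))$ with the canonical
differential $\omega_{\mathrm{can}}$ given by

$$\frac{dx}{2y + x} = \frac{dX}{Y},$$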

the variables $X$, $Y$ being defined via the substitution

$$X = x + \frac{1}{12}, \qquad Y = x + 2y,$$

as in §5.3.3.

Next let us take any $p$-adic number $q \in \mathbb{Q}_p^\times$ such that $|q|_p < 1$.
Then for any $t \in \mathbb{Q}_p^\times/\langle q \rangle$ let us consider the following series

$$\wp(t) = \sum_{n \in \mathbb{Z}} \frac{q^n t}{(1 - q^n t)^2} \tag{6.3.10}$$

and one obtains a Tate curve $E_q$ over $\mathbb{Q}_p$ with the equation (6.3.8) in which

$$B(q) = -5\sum_{n=1}^{\infty} \frac{n^3 q^n}{1 - q^n} \in \mathbb{Z}_p, \qquad
C(q) = -\frac{1}{12}\sum_{n=1}^{\infty} \frac{(7n^5 + 5n^3)\, q^n}{1 - q^n} \in \mathbb{Z}_p. \tag{6.3.11}$$

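Theorem 6.12 (Tate). There is a $\mathbb{Q}_p$-analytic isomorphism

$$\mathbb{Q}_p^\times/\langle q \rangle \xrightarrow{\ \sim\ } E_q(\mathbb{Q}_p), \qquad t \mapsto (x(t), y(t)),$$

where

$$x(t) = \sum_{n \in \mathbb{Z}} \frac{q^n t}{(1 - q^n t)^2} - 2\sum_{n=1}^{\infty} \frac{n q^n}{1 - q^n},$$

$$y(t) = \sum_{n \in \mathbb{Z}} \frac{(q^n t)^2}{(1 - q^n t)^3} + \sum_{n=1}^{\infty} \frac{n q^n}{1 - q^n}.$$

Moreover for any semistable elliptic curve $E$ over $\mathbb{Q}_p$ there exists a $p$-adic
number $q \in \mathbb{Q}_p^\times$ such that $|q|_p < 1$, and $E$ is isomorphic to $E_q$ (over an
unramified quadratic extension $\mathbb{Q}_p(\sqrt{E_4(q)})$).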


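Let $N \geq 1$ be a natural number. Let us define

$$\mathrm{Tate}(q^N) : y^2 + xy = x^3 + B(q^N)x + C(q^N).$$

Next we put $t = \exp(2\pi i u)$; then the points of order $N$ on $\mathrm{Tate}(q^N)$
correspond to $t = \zeta_N^i q^j$ $(0 \leq i,j \leq N-1)$, $\zeta_N = \exp(2\pi i/N)$, and their coordinates
are given by

$$x(t) = \sum_{n \in \mathbb{Z}} \frac{q^{Nn} t}{(1 - q^{Nn} t)^2} - 2\sum_{n=1}^{\infty} \frac{n q^{Nn}}{1 - q^{Nn}},$$

$$y(t) = \sum_{n \in \mathbb{Z}} \frac{(q^{Nn} t)^2}{(1 - q^{Nn} t)^3} + \sum_{n=1}^{\infty} \frac{n q^{Nn}}{1 - q^{Nn}}.$$

It is important for arithmetical applications of the Tate curve that these
coordinates belong to the ring $\mathbb{Z}[\zeta_N, N^{-1}][[q]]$.

Proof

uses the identity

$$\sum_{n \in \mathbb{Z}} (u + n)^{-k} = \frac{(-2\pi i)^k}{(k-1)!} \sum_{n=1}^{\infty} n^{k-1} e^{2\pi i n u} \qquad (k \geq 2,\ \operatorname{Im} u > 0).$$

We have for the lattice $\Lambda = 2\pi i(\mathbb{Z} + \tau\mathbb{Z})$ the following equalities

$$\begin{aligned}
X = \wp(2\pi i u) &= (2\pi i)^{-2}\Big( u^{-2} + \sum_{m,n \in \mathbb{Z}}{}^{\prime} \big( (u + m\tau + n)^{-2} - (m\tau + n)^{-2} \big) \Big) \\
&= (2\pi i)^{-2}\Big( \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} (u + m\tau + n)^{-2} - 2\sum_{m=1}^{\infty} \sum_{n \in \mathbb{Z}} (m\tau + n)^{-2} - 2\zeta(2) \Big) \\
&= \sum_{m \in \mathbb{Z}} \sum_{n=1}^{\infty} n\, e^{2\pi i (u + m\tau) n} - 2\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} n\, e^{2\pi i m n \tau} + \frac{1}{12},
\end{aligned} \tag{6.3.12}$$

implying the above identities.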


6.3.4 Analytic families of elliptic curves and congruence subgroups 


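For an integer $N$ the following congruence subgroups are defined:

$$\Gamma_0(N) = \{\gamma \in \mathrm{SL}_2(\mathbb{Z}) \mid c_\gamma \equiv 0 \bmod N\}, \tag{6.3.13}$$
$$\Gamma_1(N) = \{\gamma \in \Gamma_0(N) \mid a_\gamma \equiv d_\gamma \equiv 1 \bmod N\}, \tag{6.3.14}$$
$$\Gamma(N) = \{\gamma \in \mathrm{SL}_2(\mathbb{Z}) \mid \gamma \equiv 1 \bmod N\}. \tag{6.3.15}$$

More generally, a subgroup $\Gamma \subset \mathrm{SL}_2(\mathbb{Z})$ is called a congruence subgroup
iff $\Gamma \supset \Gamma(N)$ for some $N$. Consider fundamental domains in $\mathbb{H}$ for the actions
of the above congruence subgroups: (a) $\mathbb{H}/\Gamma_0(N)$, (b) $\mathbb{H}/\Gamma_1(N)$, (c) $\mathbb{H}/\Gamma(N)$.
These domains can be identified respectively with the sets of isomorphism
classes over $\mathbb{C}$ of the following objects:

(a) $(E, \langle P \rangle)$, an elliptic curve over $\mathbb{C}$ together with a cyclic subgroup of order
    $N$, $\langle P \rangle \subset E(\mathbb{C})$, $\operatorname{Card}\langle P \rangle = N$;

(b) $(E, P)$, an elliptic curve over $\mathbb{C}$ together with a point $P \in E(\mathbb{C})$ of order
    $N$, $\operatorname{Card}\langle P \rangle = N$;

(c) $(E, \phi) = (E, P, Q)$, an elliptic curve over $\mathbb{C}$ together with an isomorphism

    $$\phi : \mathbb{Z}/N\mathbb{Z} \oplus \mathbb{Z}/N\mathbb{Z} \xrightarrow{\ \sim\ } E(\mathbb{C})_N,$$

    such that

    $$\phi^* e_N = \exp(2\pi i \det/N),$$

    where $e_N$ is the Weil pairing (5.3.28); thus $\phi$ gives a basis of the points of
    order $N$:

    $$P, Q \in E(\mathbb{C})_N = \langle P \rangle \oplus \langle Q \rangle \cong \mathbb{Z}/N\mathbb{Z} \oplus \mathbb{Z}/N\mathbb{Z}.$$

In order to describe this identification one associates to a point $z \in \mathbb{H}$ the
following objects:

(a) $(\mathbb{C}/\Lambda_z, \langle 1/N \bmod \Lambda_z \rangle)$;
(b) $(\mathbb{C}/\Lambda_z, 1/N \bmod \Lambda_z)$;
(c) $(\mathbb{C}/\Lambda_z, 1/N \bmod \Lambda_z, z/N \bmod \Lambda_z)$,

    $1/N \bmod \Lambda_z \leftrightarrow (1,0) \bmod N$, $\quad z/N \bmod \Lambda_z \leftrightarrow (0,1) \bmod N$.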
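
The level structures of type (a) on a fixed curve can be enumerated directly, since $E(\mathbb{C})_N \cong (\mathbb{Z}/N\mathbb{Z})^2$. The sketch below counts the cyclic subgroups of order exactly $N$ by brute force and compares the count with $N \prod_{p \mid N} (1 + 1/p)$. That this number is also the index of $\Gamma_0(N)$ in $\mathrm{SL}_2(\mathbb{Z})$ is a standard fact that is not stated in the text above; the function names are illustrative.

```python
from math import gcd


def cyclic_subgroups_of_order_n(n):
    """Set of cyclic subgroups of order exactly n inside (Z/nZ)^2."""
    subgroups = set()
    for a in range(n):
        for b in range(n):
            # (a, b) has order exactly n iff gcd(a, b, n) == 1
            if gcd(gcd(a, b), n) == 1:
                subgroups.add(frozenset(((t * a) % n, (t * b) % n) for t in range(n)))
    return subgroups


def psi(n):
    """n * prod over primes p dividing n of (1 + 1/p), computed with integers."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result = result // p * (p + 1)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result = result // m * (m + 1)
    return result
```

For example, `len(cyclic_subgroups_of_order_n(12))` and `psi(12)` both count the 24 cyclic subgroups of order 12 in $(\mathbb{Z}/12\mathbb{Z})^2$.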


6.3.5 Modular forms for congruence subgroups 


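For the study of modular forms it is convenient to use the traditional notation

$$(f|_k\gamma)(z) = (\det\gamma)^{k/2}\, f(\gamma(z))\, (c_\gamma z + d_\gamma)^{-k} \tag{6.3.16}$$

for the weight $k$ action of an element $\gamma = \begin{pmatrix} a_\gamma & b_\gamma \\ c_\gamma & d_\gamma \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{R})$ with positive
determinant $\det\gamma > 0$.
Let $\psi$ be a Dirichlet character mod $N$. Put

$$\mathcal{M}_k(N,\psi) = \{ f \in \mathcal{M}_k(\Gamma_1(N)) \mid f|_k\gamma = \psi(d_\gamma) f \text{ for all } \gamma \in \Gamma_0(N) \}, \tag{6.3.17}$$
$$\mathcal{S}_k(N,\psi) = \mathcal{M}_k(N,\psi) \cap \mathcal{S}_k(\Gamma_1(N)). \tag{6.3.18}$$

One then has the following decomposition

$$\mathcal{M}_k(\Gamma_1(N)) = \bigoplus_{\psi \bmod N} \mathcal{M}_k(N,\psi), \qquad \mathcal{S}_k(\Gamma_1(N)) = \bigoplus_{\psi \bmod N} \mathcal{S}_k(N,\psi).$$

For a modular form $h \in \mathcal{M}_k(N,\psi)$ the Petersson inner product of $h$ with
$f \in \mathcal{S}_k(N,\psi)$ is defined by the formula

$$\langle f, h \rangle_N = \int_{\mathbb{H}/\Gamma_0(N)} \overline{f(z)}\, h(z)\, y^{k-2}\, dx\, dy, \tag{6.3.19}$$

where $z = x + iy$, and $\mathbb{H}/\Gamma_0(N)$ is a fixed fundamental domain for $\mathbb{H}$ modulo
$\Gamma_0(N)$. Then one has the following orthogonal decomposition:

$$\mathcal{M}_k(N,\psi) = \mathcal{S}_k(N,\psi) \oplus \mathcal{E}_k(N,\psi), \tag{6.3.20}$$

where $\mathcal{E}_k(N,\psi)$ is the subspace of Eisenstein series, whose basis can be
explicitly described and consists of Fourier series of type (6.3.3) ([La76], [He59],
[Shi71], [Mi89]).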

The arithmetical significance of modular forms is well illustrated by the
example of the theta series. Let

$$Q(X) = \tfrac{1}{2}\, {}^t\!X A X$$

be a positive definite, integral quadratic form in an even number $2k$ of
variables, with an even matrix $A = (a_{ij})$, $a_{ij} \in \mathbb{Z}$, $a_{ii} \in 2\mathbb{Z}$, $k \geq 2$, where
$X = {}^t(x_1, x_2, \dots, x_{2k})$ is an integral column vector. Let us associate to $Q(X)$
the following theta series

$$\theta(z;Q) = \sum_{M \in \mathbb{Z}^{2k}} e(Q(M)z) = \sum_{n=0}^{\infty} a(n)\, e(nz), \tag{6.3.21}$$

where ${}^t(m_1, m_2, \dots, m_{2k}) = M$, and

$$a(n) = a(n;Q) = \operatorname{Card}\{ M \in \mathbb{Z}^{2k} \mid Q(M) = n \} \tag{6.3.22}$$

is the number of representations of $n$ by the integral quadratic form $Q$ with
matrix $A$.

Let $N$ be the level of $Q$, i.e. the smallest positive integer $N$ such that $NA^{-1}$
is again an even integral matrix. It turns out that $\theta(z;Q) \in \mathcal{M}_k(N, \varepsilon_\Delta)$,
where $\varepsilon_\Delta(d) = \left(\frac{\Delta}{d}\right)$ is the quadratic character attached to the discriminant
$\Delta$ of the form $Q$, cf. [Ogg65], [Kog71].

The theory of modular forms makes it possible to obtain good estimates,
and sometimes even explicit expressions, for the numbers $a(n;Q)$. In order to
do this the theta function (6.3.21) is represented as a sum of an Eisenstein
series

$$E_k(z;Q) = \sum_{n=0}^{\infty} \rho_k(n;Q)\, e(nz) \in \mathcal{E}_k(N, \varepsilon_\Delta) \tag{6.3.23}$$

and a cusp form

$$S_k(z;Q) = \sum_{n=1}^{\infty} b_k(n;Q)\, e(nz) \in \mathcal{S}_k(N, \varepsilon_\Delta). \tag{6.3.24}$$

The coefficients $\rho_k(n;Q)$ are elementary arithmetical functions such as
$\sigma_{k-1}(n)$. For the coefficients of cusp forms one has the famous estimate

$$b(n) = O\left(n^{\frac{k-1}{2} + \varepsilon}\right), \quad \varepsilon > 0, \tag{6.3.25}$$

which was known as the Petersson-Ramanujan conjecture before being proved
by Deligne (cf. [Del74]), who first reduced this conjecture to the Weil
conjecture (see §6.1.3) in [Del68], and then proved all these conjectures, using
Grothendieck's étale $l$-adic cohomology.

In particular for the Ramanujan function $\tau(n)$ Deligne's estimate takes
the form

$$|\tau(p)| < 2p^{11/2} \qquad (p \text{ prime numbers}).$$

Applying the estimate (6.3.25) to the series (6.3.21) gives

$$a(n;Q) = \rho_k(n;Q) + O\left(n^{\frac{k-1}{2} + \varepsilon}\right), \quad \varepsilon > 0. \tag{6.3.26}$$


The proof of (6.3.25) is based on a geometric interpretation of a cusp form
$f \in \mathcal{S}_k(\Gamma)$ of even weight $k$ as a multiple differential: for $\gamma \in \Gamma$ the expression

$$f(z)\,(dz)^{k/2}$$

does not change if we replace $z$ by $\gamma(z)$, and can therefore be defined over the
modular curve $X_\Gamma = \overline{\mathbb{H}/\Gamma}$, i.e. on a projective algebraic curve, whose
Riemann surface is compact and is obtained by adding to $\mathbb{H}/\Gamma$ a finite number of
cusps. In particular, for $k = 2$ the expression $f(z)\,dz$ represents a holomorphic
differential on $X_\Gamma$, and the estimate (6.3.25) in this case was first established
by [Eich54] (the congruence relation of Eichler-Shimura, see a detailed
exposition in [Shi71] and in Rohrlich's papers in [CSS95]).


Many interesting examples of formulae for the numbers a(n;Q) can be 
found in the book of L.A.Kogan, [Kog71], and in [An65], [Lom78], [He59] and 
elsewhere. 
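
As a hedged illustration of such explicit formulae, take the sum of four squares $Q = x_1^2 + x_2^2 + x_3^2 + x_4^2$, so that $2k = 4$ and $A = 2 \cdot I_4$ in the notation above; its level is $N = 4$. The sketch relies on two classical facts that are not stated in the text: there are no nonzero cusp forms of weight 2 for $\Gamma_0(4)$, so the cusp part (6.3.24) vanishes, and the Eisenstein part is given by Jacobi's formula $a(n;Q) = 8 \sum_{d \mid n,\ 4 \nmid d} d$. The function names are illustrative.

```python
def theta_power_coefficients(num_squares, terms):
    """a(n) for n < terms: number of integer vectors of length num_squares
    whose sum of squares equals n (coefficients of a power of the theta series)."""
    theta = [0] * terms              # coefficients of sum_{m in Z} q^(m^2)
    m = 0
    while m * m < terms:
        theta[m * m] += 1 if m == 0 else 2
        m += 1
    result = [1] + [0] * (terms - 1)
    for _ in range(num_squares):
        result = [sum(result[i] * theta[n - i] for i in range(n + 1))
                  for n in range(terms)]
    return result


def jacobi_formula(n):
    """8 times the sum of the divisors of n that are not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


FOUR_SQUARES = theta_power_coefficients(4, 40)
```

Here the error term in (6.3.26) is identically zero, so the representation numbers are given exactly by an elementary divisor function.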


6.3.6 Hecke Theory 


In the examples of Eisenstein series and theta functions one notices an
interesting fact: the Fourier coefficients $a(n)$ of these modular forms turn out
to be either multiplicative arithmetical functions or linear combinations of
such functions. For the Ramanujan function $\tau(n)$ (6.3.6) these multiplicativity
properties have the following form

$$\tau(mn) = \tau(m)\tau(n) \quad \text{for } (m,n) = 1,$$

$$\tau(p^r) = \tau(p)\tau(p^{r-1}) - p^{11}\tau(p^{r-2}) \quad (p \text{ a prime number}, r \geq 2) \tag{6.3.27}$$

and it seems that not even these relations can be established using only
elementary methods. They might provide an example for the theorem of Gödel
(see Chapter 3) (cf. [He59], [An74], [La76]).
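
The relations (6.3.27) are easy to test numerically, even if proving them is not elementary. The following sketch computes $\tau(n)$ from the product expansion in (6.3.6) with exact integers; the function name and the truncation bound are illustrative choices.

```python
def ramanujan_tau(limit):
    """List t with t[n] = tau(n) for 1 <= n <= limit (t[0] = 0 is a placeholder),
    read off from q * prod_{m>=1} (1 - q^m)^24."""
    coeffs = [1] + [0] * (limit - 1)   # prod (1 - q^m)^24 up to q^(limit-1)
    for m in range(1, limit):
        for _ in range(24):
            # in-place multiplication by (1 - q^m); descending n uses old values
            for n in range(limit - 1, m - 1, -1):
                coeffs[n] -= coeffs[n - m]
    return [0] + coeffs                # the extra factor q shifts indices by one


TAU = ramanujan_tau(30)

# multiplicativity for coprime arguments
coprime_ok = TAU[6] == TAU[2] * TAU[3] and TAU[15] == TAU[3] * TAU[5]
# recursion on prime powers, here p = 2 and p = 3
prime_power_ok = (TAU[4] == TAU[2] ** 2 - 2 ** 11
                  and TAU[8] == TAU[2] * TAU[4] - 2 ** 11 * TAU[2]
                  and TAU[9] == TAU[3] ** 2 - 3 ** 11)
# Deligne's bound |tau(p)| < 2 p^(11/2), squared to stay within integers
deligne_ok = all(TAU[p] ** 2 < 4 * p ** 11 for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
```

A finite check of this kind proves nothing, but it makes the shape of the relations concrete before Hecke operators are introduced to explain them.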


Let $m$ be a positive integer, $f(z) = \sum_{n=0}^{\infty} a(n)\,e(nz)$ a function on $\mathbb{H}$. Then
the following functions are defined

$$f|U(m)(z) = \sum_{n=0}^{\infty} a(mn)\, e(nz) = m^{k/2-1} \sum_{u \bmod m} f|_k \begin{pmatrix} 1 & u \\ 0 & m \end{pmatrix},$$

$$f|V(m)(z) = \sum_{n=0}^{\infty} a(n)\, e(mnz) = f(mz) = m^{-k/2}\, f|_k \begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix}. \tag{6.3.28}$$

Imagine that the operator

$$f \mapsto f|U(m)$$

acts on the space of modular forms $\mathcal{M}_k(N,\psi)$. Then one might expect to find
a basis consisting of its eigenfunctions. Assuming $f$ to be an eigenfunction
would then imply the relations

$$a(mn) = \lambda(m)\,a(n) \qquad (n \in \mathbb{N}),$$

where $\lambda(m)$ are the corresponding eigenvalues:

$$f|U(m) = \lambda(m) f.$$

The desired multiplicativity property would then follow. However, if
$f \in \mathcal{M}_k(N,\psi)$ then in general one can only state that

$$f|U(m)(z),\ f|V(m)(z) \in \mathcal{M}_k(mN,\psi), \tag{6.3.29}$$

and

$$f|U(m)(z) \in \mathcal{M}_k(N,\psi)$$

holds only when $m$ divides $N$. In order to overcome this difficulty
in the general case note that the matrices $\begin{pmatrix} 1 & u \\ 0 & m \end{pmatrix}$ in the definition (6.3.28)
of $U(m)$ form part of a complete system of right coset representatives for
$\Gamma_0(N)\backslash\Delta_m(N)$, where $\Delta_m(N)$ denotes the set

$$\Delta_m(N) = \left\{ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \ \Big|\ a,b,c,d \in \mathbb{Z},\ c \equiv 0 \bmod N,\ \det\gamma = m \right\},$$

which is invariant under right multiplication by elements of $\Gamma_0(N)$. As a
complete system of right coset representatives for $\Gamma_0(N)\backslash\Delta_m(N)$ one could take
the set

$$\left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \ \Big|\ d > 0,\ ad = m,\ b = 0, \dots, d-1 \right\}. \tag{6.3.30}$$

This fact makes it possible to define instead of $U(m)$ another operator
which does act on the space of modular forms $\mathcal{M}_k(N,\psi)$. This other operator
is called the Hecke operator $T_k(m)$:

$$f \mapsto f|_k T(m) = m^{k/2-1} \sum_{\sigma} \psi(a_\sigma)\, f|_k\sigma, \tag{6.3.31}$$

where $\sigma = \begin{pmatrix} a_\sigma & b_\sigma \\ c_\sigma & d_\sigma \end{pmatrix} \in \Gamma_0(N)\backslash\Delta_m(N)$, $(m,N) = 1$.
The action of $T_k(m)$ on the Fourier coefficients is easily calculated using
the system of representatives (6.3.30):

$$\begin{aligned}
f|_k T(m) &= \sum_{m_1 \mid m} \psi(m_1)\, m_1^{k-1}\, f|U(m/m_1)V(m_1) \\
&= \sum_{n=0}^{\infty} \sum_{m_1 \mid (m,n)} \psi(m_1)\, m_1^{k-1}\, a(mn/m_1^2)\, e(nz),
\end{aligned} \tag{6.3.32}$$

where it is implicitly assumed that $a(x) = 0$ for $x \notin \mathbb{Z}$.
Multiplying the systems (6.3.30) together shows that the multiplication
rule for the operators $T_k(m)$ is as follows:

$$T_k(m)T_k(n) = \sum_{m_1 \mid (m,n)} \psi(m_1)\, m_1^{k-1}\, T_k(mn/m_1^2). \tag{6.3.33}$$

In particular all the operators $T_k(n)$ commute. If $f \in \mathcal{M}_k(N,\psi)$ is an
eigenfunction for all $T_k(m)$ with $(m,N) = 1$, i.e. if

$$f|T_k(m) = \lambda_f(m) f \qquad ((m,N) = 1), \tag{6.3.34}$$

then (6.3.33) implies that

$$\lambda_f(m)\lambda_f(n) = \sum_{m_1 \mid (m,n)} \psi(m_1)\, m_1^{k-1}\, \lambda_f(mn/m_1^2).$$

Equating the Fourier coefficients $a(n)$ in (6.3.34) one obtains the following
equalities

$$a(0) \sum_{m_1 \mid m} \psi(m_1)\, m_1^{k-1} = \lambda_f(m)\, a(0),$$

$$\sum_{m_1 \mid (m,n)} \psi(m_1)\, m_1^{k-1}\, a(mn/m_1^2) = \lambda_f(m)\, a(n). \tag{6.3.35}$$

In particular for $n = 1$ one has

$$a(m) = \lambda_f(m)\, a(1), \tag{6.3.36}$$

and for $a(1) \neq 0$ the function $a(m)$ is therefore proportional to the function
$\lambda_f(m)$ for $(m,N) = 1$.
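
The coefficient formula in (6.3.35) can be turned into a few lines of code. The sketch below assumes level $N = 1$ and trivial character, so that $\psi(m_1) = 1$ for every $m_1$, and applies the operator to the discriminant form of (6.3.6), whose eigenvalue should then be $\lambda(m) = \tau(m)$ by (6.3.36). The function names are illustrative.

```python
from math import gcd


def ramanujan_tau(limit):
    """List t with t[n] = tau(n) for 1 <= n <= limit, from q * prod (1 - q^m)^24."""
    coeffs = [1] + [0] * (limit - 1)
    for m in range(1, limit):
        for _ in range(24):
            for n in range(limit - 1, m - 1, -1):
                coeffs[n] -= coeffs[n - m]
    return [0] + coeffs


def hecke_coefficients(a, m, k, terms):
    """Coefficients b(1..terms) of f|T_k(m) for level 1 and trivial character:
    b(n) = sum over d | gcd(m, n) of d^(k-1) * a(m*n / d^2).
    The list `a` must contain a(n) for all n <= m * terms."""
    b = [0] * (terms + 1)
    for n in range(1, terms + 1):
        g = gcd(m, n)
        b[n] = sum(d ** (k - 1) * a[m * n // (d * d)]
                   for d in range(1, g + 1) if g % d == 0)
    return b


TAU = ramanujan_tau(60)


def is_eigenform_for(m, terms=10):
    """Check Delta|T_12(m) = tau(m) * Delta on the first `terms` coefficients."""
    b = hecke_coefficients(TAU, m, 12, terms)
    return all(b[n] == TAU[m] * TAU[n] for n in range(1, terms + 1))
```

Calling `is_eigenform_for(m)` for a few values of $m$ reproduces, coefficient by coefficient, the relations (6.3.27) for the Ramanujan function.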


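All these properties can be especially neatly expressed in terms of Dirichlet
series: for the function

$$f(z) = \sum_{n=0}^{\infty} a(n)\, e(nz) \in \mathcal{M}_k(N,\psi),$$

satisfying (6.3.34) put formally

$$L_N(s,f) = \sum_{\substack{n=1 \\ (n,N)=1}}^{\infty} \lambda_f(n)\, n^{-s}, \qquad R_N(s,f) = \sum_{\substack{n=1 \\ (n,N)=1}}^{\infty} a(n)\, n^{-s}. \tag{6.3.37}$$

Then these (formal) Dirichlet series satisfy the following identities:

I) The Euler product expansion:

$$L_N(s,f) = \prod_{p \nmid N} \left[ 1 - \lambda_f(p)\,p^{-s} + \psi(p)\,p^{k-1-2s} \right]^{-1}. \tag{6.3.38}$$

II) $R_N(s,f) = a(1)\, L_N(s,f)$.

Indeed the multiplication rule (6.3.33) for distinct primes $p_i, p_j \nmid N$ implies
that one has

$$L_N(s,f) = \prod_{p:\, p \nmid N} \left( \sum_{\delta=0}^{\infty} \lambda_f(p^\delta)\, p^{-\delta s} \right)$$

and each of the series can be summed over $\delta$ using the relation

$$\lambda_f(p)\lambda_f(p^\delta) = \lambda_f(p^{\delta+1}) + \psi(p)\, p^{k-1} \lambda_f(p^{\delta-1}) \qquad (\delta \geq 1). \tag{6.3.39}$$

Equation II) follows then directly from (6.3.36).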
Convergence of the series (6.3.37) for $\operatorname{Re}(s) \gg 0$ follows from the following
estimates for the coefficients:

a) If $f \in \mathcal{M}_k(N,\psi)$ then

   $$|a(n)| = O(n^{k-1+\varepsilon}), \quad \varepsilon > 0, \tag{6.3.40}$$

   and the Dirichlet series (6.3.37) converge absolutely for $\operatorname{Re} s > k$.

b) If $f \in \mathcal{S}_k(N,\psi)$ then

   $$|a(n)| = O(n^{\frac{k-1}{2}+\varepsilon}), \quad \varepsilon > 0, \tag{6.3.41}$$

   and the series $L_N(s,f)$ and $R_N(s,f)$ converge absolutely for $\operatorname{Re} s > \frac{k+1}{2}$.

The estimates (6.3.40) and especially (6.3.41) use some fine arithmetical
properties of the coefficients $a(n)$. However, using only analytic properties of
$f(z)$ (the fact that it is holomorphic and the automorphy condition (6.3.2))
one can easily obtain rougher estimates:

a) $|a(n)| = O(n^{k})$ for $f \in \mathcal{M}_k(N,\psi)$;

b) $|a(n)| = O(n^{k/2})$ for $f \in \mathcal{S}_k(N,\psi)$;
   the latter estimate is implied by the estimate $|f(z)| = O(y^{-k/2})$ ($y \to 0$,
   $z = x + iy$).


A basis consisting of eigenfunctions for Hecke operators can be found using
the Petersson inner product (6.3.19). One verifies that the operators $T_k(m)$
on $\mathcal{S}_k(N,\psi)$ are normal with respect to this inner product for $(m,N) = 1$.
Moreover, the operators are $\psi$-Hermitian: for all $f, h \in \mathcal{S}_k(N,\psi)$ the following
equation holds:

$$\psi(m)\, \langle f|T_k(m), h \rangle_N = \langle f, h|T_k(m) \rangle_N. \tag{6.3.42}$$

By a general theorem of linear algebra on families of commuting normal
operators, there is an orthogonal basis of $\mathcal{S}_k(N,\psi)$ consisting of common
eigenfunctions of all the $T_k(m)$ ($(m,N) = 1$). A basis with this property is
called a Hecke basis. In the case that the number $m$ is divisible only by prime
divisors of the level $N$, one may use the operator $U(m)$ instead of $T_k(m)$.
As was mentioned above (see (6.3.28)) these operators leave $\mathcal{M}_k(N,\psi)$ invariant.
However, they are not normal and in general are not diagonalizable in
$\mathcal{S}_k(N,\psi)$.
The Mellin transform of a modular form and its analytic continuation. Let

$$f(z) = \sum_{n=0}^{\infty} a(n)\, e(nz) \in \mathcal{M}_k(N,\psi).$$

Then the Dirichlet series

$$R(s,f) = R_1(s,f) = \sum_{n=1}^{\infty} a(n)\, n^{-s},$$

which converges absolutely for $\operatorname{Re} s \gg 0$, can be analytically continued to the
entire complex plane using the Mellin transform of the modular form $f$:

$$(2\pi)^{-s}\,\Gamma(s)\,R(s,f) = \int_0^{\infty} [f(iy) - a(0)]\, y^{s-1}\, dy \qquad (\operatorname{Re}(s) \gg 0). \tag{6.3.43}$$

This can be seen by integrating termwise the series

$$f(iy) - a(0) = \sum_{n=1}^{\infty} a(n) \exp(-2\pi n y)$$

and using the integral representation of the gamma function:

$$\Gamma(s) = \int_0^{\infty} e^{-y}\, y^{s-1}\, dy \qquad (\operatorname{Re}(s) > 0).$$

The vector space of all Dirichlet series of type $R(s,f)$ for $f \in \mathcal{M}_k(N,\psi)$
can be characterized by analytic properties of these series. Following [An74],
we give this characterization in the case $N = 1$.

Theorem A. Let $f \in \mathcal{M}_k = \mathcal{M}_k(\mathrm{SL}_2(\mathbb{Z}))$. Then the Dirichlet series $R(s,f)$
admits a meromorphic continuation to the entire complex plane, and if one
puts

$$\Lambda(s,f) = (2\pi)^{-s}\,\Gamma(s)\,R(s,f),$$

then the function

$$\Lambda(s,f) + \frac{a(0)}{s} + \frac{(-1)^{k/2}\,a(0)}{k-s} \tag{6.3.44}$$

is entire. The following functional equation holds

$$\Lambda(k-s,f) = (-1)^{k/2}\,\Lambda(s,f). \tag{6.3.45}$$

Theorem B. Every Dirichlet series $R(s) = \sum_{n=1}^{\infty} a(n)\,n^{-s}$ whose coefficients
$a(n)$ have not more than a polynomial order of growth, and which satisfies
(6.3.44) and (6.3.45), has the form $R(s,f)$ for some modular form
$f \in \mathcal{M}_k = \mathcal{M}_k(\mathrm{SL}_2(\mathbb{Z}))$.

Indeed, in order to prove Theorem A we use the Mellin transform (6.3.43)
and write

$$\Lambda(s,f) = \int_0^{\infty} [f(iy) - a(0)]\, y^{s-1}\, dy \qquad (\operatorname{Re}(s) > k+1).$$

Taking into account that $f(-1/z) = z^k f(z)$ we see that

$$\begin{aligned}
\Lambda(s,f) &= \int_1^{\infty} [f(iy) - a(0)]\, y^{s-1}\, dy - \frac{a(0)}{s} + \int_0^1 f(iy)\, y^{s-1}\, dy \\
&= \int_1^{\infty} [f(iy) - a(0)]\, y^{s-1}\, dy - \frac{a(0)}{s} + \int_1^{\infty} f(-1/iy)\, y^{-s-1}\, dy \\
&= \int_1^{\infty} [f(iy) - a(0)]\,(y^{s-1} + i^k y^{k-s-1})\, dy - \frac{a(0)}{s} - \frac{i^k a(0)}{k-s}.
\end{aligned}$$

The function $f(iy) - a(0)$ tends to zero exponentially as $y \to \infty$, so the last
integral converges absolutely for all $s$ and turns out to be a holomorphic
function of the variable $s$. This proves (6.3.44). By the substitution $s \mapsto k - s$
the last written expression for $\Lambda(s,f)$ is multiplied by $i^k$, and both the
functional equation (6.3.45) and Theorem A follow.
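
The only input from modularity used in this proof is the relation $f(-1/z) = z^k f(z)$. As a sanity check, the sketch below evaluates it numerically for the weight 12 cusp form $q\prod_{m \ge 1}(1-q^m)^{24}$ from (6.3.6), normalized here without the factor $2^{-4}(2\pi)^{12}$. The truncation at 100 factors and the function names are illustrative choices.

```python
import cmath
import math


def delta(z, terms=100):
    """Truncated product q * prod_{m=1}^{terms} (1 - q^m)^24 with q = exp(2*pi*i*z)."""
    q = cmath.exp(2j * math.pi * z)
    product, q_power = q, 1
    for _ in range(terms):
        q_power *= q                  # running power q^m
        product *= (1 - q_power) ** 24
    return product


def modularity_defect(z):
    """Relative size of delta(-1/z) - z^12 * delta(z); close to 0 for Im z > 0."""
    lhs = delta(-1 / z)
    rhs = z ** 12 * delta(z)
    return abs(lhs - rhs) / abs(rhs)
```

On the imaginary axis $z = iy$ the relation reads $f(i/y) = (iy)^{k} f(iy)$, which is exactly the substitution made in the second line of the computation above.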

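In order to prove Theorem B it suffices to use the inverse Mellin transform,
and the fact that the whole modular group $\mathrm{SL}_2(\mathbb{Z})$ is generated by the matrices
$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.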

Theorems A and B can be extended to modular forms of integral weight for
congruence subgroups of $\mathrm{SL}_2(\mathbb{Z})$ (with natural technical complications). The
theorem generalizing Theorem A in this situation is called the direct theorem,
and the generalization of Theorem B is called the inverse theorem (or converse
theorem, cf. [CoPSh02]). This inverse theorem was stated by [Wei67], [Wei71]
in terms of the twisted Dirichlet series

$$\Lambda^*(s,\chi) = (2\pi)^{-s}\,\Gamma(s) \sum_{n=1}^{\infty} \chi(n)\,a(n)\,n^{-s}, \tag{6.3.46}$$

where $\chi$ is an arbitrary Dirichlet character. Assume that the series

$$R(s) = \sum_{n=1}^{\infty} a(n)\, n^{-s}$$

converges absolutely for $s = k - \delta$, $\delta > 0$, and that for all $\chi \bmod r$, $(r,N) = 1$,
the functions (6.3.46) are entire, bounded in every vertical strip and satisfy
certain functional equations relating $\Lambda^*(s,\chi)$ to $\Lambda^*(k-s,\overline{\chi})$. Then one can
deduce that the Fourier series

$$f(z) = \sum_{n=1}^{\infty} a(n)\, e(nz)$$

represents a modular form in $\mathcal{S}_k(N,\psi)$. In other words, the automorphy
properties of this Fourier series can be deduced from functional equations of the
corresponding Dirichlet series twisted by Dirichlet characters; the precise form
of the functional equations for these series is given in [Wei67].


6.3.7 Primitive Forms 


Atkin and Lehner have made an important complement to Hecke’s theory
by constructing a satisfactory theory of Hecke operators for all $m$ including
the divisors of the level. We begin with a simple example (following [La76],
[Fom77], [Gel75]). Consider the vector space $S_{12}(\Gamma_0(2))$ containing $f_1(z) = \Delta(z)$
and $f_2(z) = \Delta(2z)$. These two functions have the same eigenvalues for
all Hecke operators $T_{12}(p)$ with $p \ne 2$. However they are linearly independent.
A natural question then arises: which additional conditions must be imposed
on $f(z) = \sum_n a(n)e(nz) \in S_k(N,\psi)$ so that it is uniquely determined by
its eigenvalues $\lambda_f(m)$ for $(m, N) = 1$?

In order to find such conditions one constructs first the subspace of old
forms $S_k^{old}(\Gamma_1(N)) \subset S_k(\Gamma_1(N))$ as the sum of images of the operators

$$V(d') : S_k(\Gamma_1(N/d)) \to S_k(\Gamma_1(N))$$

(see (6.3.28)) for all divisors $d$ of the level $N$, and for all divisors $d'$ of $d$. Set

$$S_k^{old}(N,\psi) = S_k(N,\psi) \cap S_k^{old}(\Gamma_1(N)).$$


Then the vector space of new forms of level precisely $N$ is defined to be the
orthogonal complement of the old forms:

$$S_k(N,\psi) = S_k^{new}(N,\psi) \oplus S_k^{old}(N,\psi). \qquad (6.3.47)$$

The main result of Atkin–Lehner theory is that if a function $f \in S_k^{new}(N,\psi)$
is an eigenfunction of all Hecke operators $T_k(m)$ with $(N,m) = 1$, then $f$ is
uniquely determined (up to a multiplicative constant) by the eigenvalues and
one can normalize $f$ by the condition $a(1) = 1$. A primitive form of conductor
$N$ is then defined as a normalized new eigenform $f \in S_k^{new}(N,\psi)$. For such
forms $f$ the condition $f|U(q) = a(q)f$ for $q \mid N$ is automatically satisfied. One
has the following Euler product expansion:

$$L(s,f) = \sum_{n=1}^{\infty} a(n)n^{-s} = \prod_{q \mid N}(1 - a(q)q^{-s})^{-1}\prod_{p \nmid N}(1 - a(p)p^{-s} + \psi(p)p^{k-1-2s})^{-1}, \qquad (6.3.48)$$


in which $|a(q)| = q^{(k-1)/2}$ if the character $\psi$ cannot be defined modulo the
smaller level $N/q$, and if $\psi$ is defined modulo $N/q$ then $a(q)^2 = \psi(q)q^{k-2}$
(with $\psi$ regarded as a character modulo $N/q$) provided $q^2 \nmid N$, and $a(q) = 0$
otherwise (i.e. for $q^2 \mid N$), cf. [Li75].

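Let $f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_k(N,\psi)$ be a primitive cusp form of
conductor $C_f$, $C_f \mid N$. If we put

$$W(C_f) = \begin{pmatrix} 0 & -1\\ C_f & 0\end{pmatrix}, \qquad f^{\rho}(z) = \overline{f(-\bar z)} = \sum_{n=1}^{\infty}\overline{a(n)}\,e(nz) \in S_k(N,\bar\psi),$$

then there is a complex number $\lambda(f)$ with $|\lambda(f)| = 1$ such that

$$f\,|\,W(C_f) = \lambda(f)f^{\rho}. \qquad (6.3.49)$$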


Primitive cusp forms of a given conductor are characterized by the identity
(6.3.49), which is equivalent to a certain nice functional equation for the
corresponding Dirichlet series (cf. [Li75])

$$L(s,f) = \prod_p (1 - a(p)p^{-s} + \psi(p)p^{k-1-2s})^{-1} = \prod_p \left[(1-\alpha(p)p^{-s})(1-\alpha'(p)p^{-s})\right]^{-1},$$

where

$$\alpha(p)\alpha'(p) = \psi(p)p^{k-1}, \qquad \alpha(p) + \alpha'(p) = a(p). \qquad (6.3.50)$$

If we put

$$\Lambda(s,f) = (2\pi/\sqrt{C_f})^{-s}\Gamma(s)L(s,f),$$

then this functional equation has the form

$$\Lambda(k-s,f) = i^k\lambda(f)\Lambda(s,f^{\rho}). \qquad (6.3.51)$$


For a primitive Dirichlet character $\chi$ whose conductor $C_\chi$ is coprime to
$C_f$, the twisted modular form

$$f_\chi(z) = \sum_{n=0}^{\infty}\chi(n)a(n)e(nz) \in S_k(C_f C_\chi^2, \psi\chi^2) \qquad (6.3.52)$$

is a primitive cusp form of conductor $C_f C_\chi^2$ (compare with (6.3.51) and (6.3.45)).


6.3.8 Weil’s Inverse Theorem 


A converse statement concerning analytic properties of the series (6.3.52) was
found by A. Weil [Wei67], giving necessary and sufficient conditions that a
Fourier series $f(z) = \sum_{n=0}^{\infty} a(n)e(nz)$ represents a modular form in $M_k(N,\psi)$
in terms of the Dirichlet series

$$\Lambda(s,f,\chi) = (2\pi)^{-s}\Gamma(s)\sum_{n=1}^{\infty}\chi(n)a(n)n^{-s},$$

where $\chi$ is a Dirichlet character.
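If $f \in M_k(N,\psi)$, and $\chi$ is primitive mod $m$, let us consider the twist

$$f_\chi(z) = \sum_{n=0}^{\infty}\chi(n)a(n)e(nz) = \chi(-1)\frac{G(\chi)}{m}\sum_{a \bmod m}\bar\chi(a)\,f\,\Big|\begin{pmatrix} m & a\\ 0 & m\end{pmatrix},$$

where $G(\chi) = \sum_{a \bmod m}\chi(a)e(a/m)$ is the Gauss sum, and one checks that
$f_\chi(z) \in S_k(Nm^2,\psi\chi^2)$.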


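Moreover if $f|W_N = C_1 f^{\rho}$, then

$$f_\chi\,|\,W_{Nm^2} = C_\chi f^{\rho}_{\bar\chi}, \qquad (6.3.53)$$

where

$$C_\chi = C_1\psi(m)\chi(-N)\frac{G(\chi)}{G(\bar\chi)}, \qquad (6.3.54)$$

$$g\,|\,W_N(z) = g\,\Big|\begin{pmatrix} 0 & -1\\ N & 0\end{pmatrix}(z) = N^{k/2}(Nz)^{-k}g(-1/Nz), \qquad (6.3.55)$$

$$g^{\rho}(z) = g|K(z) = \overline{g(-\bar z)} = \sum_{n=0}^{\infty}\overline{a(n)}\,e(nz).$$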


These statements follow from known properties of Gauss sums. 


Lemma 6.13. Let $G_b(\chi) = \sum_{a \bmod m}\chi(a)e(ab/m)$, and consider a primitive
Dirichlet character $\chi$ modulo $m$. Then:

(i) $G_b(\chi) = \bar\chi(b)G(\chi)$,
(ii) $G(\chi)G(\bar\chi) = \chi(-1)m$.
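
Both identities of Lemma 6.13 are easy to test numerically. The following is only a minimal sketch under stated assumptions: it takes $m = 7$ and one particular non-trivial (hence primitive) character built from a primitive root; the helper names `make_character` and `gauss_sum` are illustrative and do not come from the text.

```python
import cmath

def e(x):
    """e(x) = exp(2*pi*i*x), the notation used in the text."""
    return cmath.exp(2j * cmath.pi * x)

def primitive_root(p):
    """Smallest generator of the multiplicative group modulo an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found")

def make_character(p, j):
    """Character chi(g^k) = e(j*k/(p-1)) modulo the prime p (non-trivial for 0 < j < p-1)."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}
    def chi(a):
        a %= p
        return 0j if a == 0 else e(j * log[a] / (p - 1))
    return chi

def gauss_sum(chi, m, b=1):
    """G_b(chi) = sum over a mod m of chi(a) e(ab/m); b = 1 gives G(chi)."""
    return sum(chi(a) * e(a * b / m) for a in range(m))

# Assumed example: modulus 7 and the character sending the primitive root to e(1/6).
m = 7
chi = make_character(m, 1)
chi_bar = lambda a: chi(a).conjugate()

G, G_bar = gauss_sum(chi, m), gauss_sum(chi_bar, m)
# (i)  G_b(chi) = conj(chi)(b) * G(chi) for every residue b
check_i = all(abs(gauss_sum(chi, m, b) - chi_bar(b) * G) < 1e-9 for b in range(m))
# (ii) G(chi) * G(conj(chi)) = chi(-1) * m
check_ii = abs(G * G_bar - chi(-1) * m) < 1e-9
print(check_i, check_ii)
```

Floating-point arithmetic is used, so the identities are checked only up to a small tolerance; an exact check would need cyclotomic arithmetic.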


We see that (6.3.53) is equivalent to the identity 
G(x) ay (™ 4) _ oo GOO ay (™ 
f= xO (GO  )=astl— YS xO ( 


a 
m 0m 
amodm amod™m 


) IKWrm?, 


and one may commute the terms on the right as follows:


a mod m 

cS yas (779) Wa = 
a mod m 

62D asic (™ 2) Wren, 


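because of the equality $f\,\big|\begin{pmatrix} m & -a\\ 0 & m\end{pmatrix}KW_{Nm^2} = f\,\big|K\begin{pmatrix} m & a\\ 0 & m\end{pmatrix}W_{Nm^2}$, and we
replace $a$ by $-a$. Notice also that

$$f\,\Big|K\begin{pmatrix} m & a\\ 0 & m\end{pmatrix}W_{Nm^2} = f\,\Big|KW_N\,\gamma(a,b)\begin{pmatrix} m & b\\ 0 & m\end{pmatrix}, \qquad \gamma(a,b) = \begin{pmatrix} m & -b\\ -Na & n\end{pmatrix} \in \Gamma_0(N),$$

because of the equality

$$m\,W_N\,\gamma(a,b)\begin{pmatrix} m & b\\ 0 & m\end{pmatrix} = \begin{pmatrix} m & a\\ 0 & m\end{pmatrix}W_{Nm^2}, \qquad -Nab \equiv 1 \pmod m$$

(the scalar factor $m$ acts trivially on $f$).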


Let us rewrite (6.3.53) in the following form (6.3.56): 


Go) So xf Gee (6.3.56) 


0m 


=O,60X-N) SY XO FKWwr(a,8) Car 


amodm 
It follows that

$$f\,|\,KW_N = (-1)^k f\,|\,W_N K = (-1)^k (C_1 f|K)K = (-1)^k\bar C_1 f,$$

$$f\,|\,KW_N\gamma(a,b) = (-1)^k\bar C_1\psi(n)f = (-1)^k\bar C_1\bar\psi(m)f,$$

hence (6.3.56) implies that $C_\chi$ is given by the above formula (6.3.54).
Let us rewrite (6.3.56) in the equivalent form: 


YS MOA(P A) =o TY Wein (F 2) ssn 


a mod m a mod m 
using $-Nab \equiv 1 \pmod m$.
The Mellin transform gives directly the equality:

$$\Lambda^*(s,f,\chi) = C_\chi i^k (Nm^2)^{k/2-s}\Lambda^*(k-s,f^{\rho},\bar\chi).$$


A converse statement concerning analytic properties of the series (6.3.52) 
was found by A.Weil [Wei67]. 


Theorem 6.14 (A. Weil, 1967). Suppose that a series

$$R(s) = \sum_{n=1}^{\infty} a(n)n^{-s}$$

with complex coefficients $a(n)$ converges absolutely for $s = k-\delta$, $\delta > 0$,
and that for every $\chi \bmod m$ of prime conductor $m$ ($m \nmid N$) the function
$\Lambda^*(s,f,\chi)$ is entire, bounded in every vertical strip and satisfies the following
functional equation:

$$\Lambda^*(s,f,\chi) = C_\chi i^k (Nm^2)^{k/2-s}\Lambda^*(k-s,f^{\rho},\bar\chi).$$

Then

$$f(z) = \sum_{n=1}^{\infty} a(n)e(nz)$$

represents a modular cusp form in $S_k(N,\psi)$.
In other words, the properties of automorphy of such series can be deduced
from the functional equations and the analytic properties of all the series
(6.3.52).
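Sketch of the proof.

Let us consider the group algebra $\mathbb{C}[\mathrm{GL}^+(2,\mathbb{R})]$ of $\mathrm{GL}^+(2,\mathbb{R})$, and its right
ideal

$$\Omega_f \subset \mathbb{C}[\mathrm{GL}^+(2,\mathbb{R})]$$

consisting of all $w \in \mathbb{C}[\mathrm{GL}^+(2,\mathbb{R})]$ such that $f|w = 0$.
One has to show that for every $\gamma = \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in \Gamma_0(N)$ the element
$w = (\psi(d) - \gamma) = \psi(d)(1 - \psi(a)\gamma)$ belongs to $\Omega_f$.
Let us consider first the elements $\gamma = \gamma(a,b) = \begin{pmatrix} m & -b\\ -Na & n\end{pmatrix} \in \Gamma_0(N)$ as
above and let us show

$$(1 - \psi(m)\gamma(a,b)) \in \Omega_f$$

for all $\gamma(a,b) = \begin{pmatrix} m & -b\\ -Na & n\end{pmatrix}$ with $m, n \in S$, where $S$ is the set of all primes $m$
for which the functional equation is satisfied for every $\chi \bmod m$.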


Lemma 6.15. The equality (6.3.57) is equivalent to (6.3.58): for all $b, b' \bmod m$,
$(b,m) = (b',m) = 1$ one has


(1 = d(m)(a,d)) ¢ ") (6.3.58) 


The proof of Lemma 6.15 is a simple verification: for all $b', b'' \bmod m$ with $(b',m) =
(b'',m) = 1$ one can multiply the two sides of (6.3.57) by $\chi(b') - \chi(b'')$ and
sum over $\chi$, cf. op. cit.


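In order to finish the proof of Theorem 6.14, it suffices to show that
$1 - \psi(a)\gamma \in \Omega_f$ for all other elements $\gamma = \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in \Gamma_0(N)$.

First, let $c = 0$, $\gamma = \begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}$; then $f = f|\gamma$ because $f$ is periodic of period 1.

Next, let $b = 0$, $\gamma = \begin{pmatrix} 1 & 0\\ c & 1\end{pmatrix} \in \Gamma_0(N)$. Then

$$\gamma = W_N\begin{pmatrix} 1 & -c/N\\ 0 & 1\end{pmatrix}W_N^{-1},$$

hence

$$f|\gamma = (-1)^k f\,\Big|K^2W_N\begin{pmatrix} 1 & -c/N\\ 0 & 1\end{pmatrix}W_N = f\,\Big|KW_NK\begin{pmatrix} 1 & -c/N\\ 0 & 1\end{pmatrix}W_N = (-1)^kC_1 f\,\Big|\begin{pmatrix} 1 & c/N\\ 0 & 1\end{pmatrix}KW_N = C_1\bar C_1 f = f,$$

in view of the known properties of $K$ and $W_N$: $KW_N = (-1)^kW_NK$, $W_N^2 =
(-1)^k$, and from the fact that $f|KW_N = (-1)^k\bar C_1 f$, which is equivalent to the
functional equation with $\chi = 1$, $m = 1$. Finally, if $\gamma = \begin{pmatrix} a & b\\ c & d\end{pmatrix} \in \Gamma_0(N)$, $b \ne 0$,
let us choose $s, t \in \mathbb{Z}$ such that $m = a + Nbs$, $n = d + Nbt \in S$. Then

$$\gamma = \begin{pmatrix} 1 & 0\\ -Nt & 1\end{pmatrix}\begin{pmatrix} m & b\\ * & n\end{pmatrix}\begin{pmatrix} 1 & 0\\ -Ns & 1\end{pmatrix},$$

$$f|\gamma = \psi(n)f = \psi(d)f.$$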


This proves the automorphy property; it remains to check the vanishing
of $f$ at the cusps of $\Gamma_0(N)$. Let us use the absolute convergence of the series

$$R(s) = \sum_{n=1}^{\infty} a(n)n^{-s}$$

at $s = k-\delta$, $\delta > 0$. One easily deduces analytically that $f(x+iy) = O(y^{-k+\delta})$
for $y \to 0$. This means that $f$ is a cusp form in $S_k(N,\psi)$.


6.4 Modular Forms and Galois Representations 


6.4.1 Ramanujan’s congruence and Galois Representations 


A new chapter in the theory of modular forms, and generally in arithmetic, was 
opened by Serre and Deligne, who discovered a link between modular forms 
and Galois representations. Their results have enhanced our understanding of 
a universal role played by modular forms in number theory, and have explained 
a whole series of mysterious facts concerning various arithmetical functions. 
Examples of these facts are the conjecture of Ramanujan–Petersson $|\tau(p)| <
2p^{11/2}$ for the Ramanujan function $\tau(p)$, and the congruence of Ramanujan

$$\tau(n) \equiv \sum_{d \mid n} d^{11} \pmod{691}. \qquad (6.4.1)$$
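
The congruence (6.4.1) and the Ramanujan–Petersson bound can be checked for small arguments directly from the product expansion $\Delta = q\prod_{n \ge 1}(1-q^n)^{24}$. The sketch below is illustrative only: the function names and the range $n \le 60$ are assumptions made for the example.

```python
def tau_coefficients(n_max):
    """Return [tau(1), ..., tau(n_max)] from Delta = q * prod_{n>=1} (1 - q^n)^24."""
    poly = [0] * n_max          # poly[i] is the coefficient of q^i in the product
    poly[0] = 1
    for n in range(1, n_max):
        for _ in range(24):     # multiply 24 times by (1 - q^n), truncating at q^(n_max-1)
            for d in range(n_max - 1, n - 1, -1):
                poly[d] -= poly[d - n]
    return poly                 # tau(i + 1) = poly[i] because of the extra factor q

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

N_MAX = 60  # assumed range for this check
tau = tau_coefficients(N_MAX)

# Ramanujan's congruence tau(n) = sigma_11(n) mod 691
congruence_ok = all((tau[n - 1] - sigma(11, n)) % 691 == 0 for n in range(1, N_MAX + 1))
# Ramanujan-Petersson bound |tau(p)| < 2 p^(11/2), tested as tau(p)^2 < 4 p^11
bound_ok = all(tau[p - 1] ** 2 < 4 * p ** 11 for p in range(2, N_MAX + 1) if is_prime(p))
print(tau[:5], congruence_ok, bound_ok)
```

Exact integer arithmetic is used throughout, so the check is not affected by rounding.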


The first result in this direction concerns the normalized cusp forms

$$f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_k(\mathrm{SL}_2(\mathbb{Z})) \quad\text{with } k = 12, 16, 18, 20, 22, 26,$$

when $\dim S_k(\mathrm{SL}_2(\mathbb{Z})) = 1$. Serre conjectured (in [Se68a], [Se68b]), and
Deligne proved (cf. [Del68]), that for each of the above cusp forms and for
every prime number $l$ there exists a continuous Galois representation

$$\rho_l : \mathrm{Gal}(K_l/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{Z}_l) \qquad (6.4.2)$$

(where $K_l$ is the maximal extension of $\mathbb{Q}$ ramified only at $l$) with the property
that the image of the $p$-Frobenius element $F_{p,l} = \rho_l(\mathrm{Fr}_p)$ for $p \ne l$ has
characteristic polynomial $t^2 - a(p)t + p^{k-1}$, where $a(p)$ is the $p$-th coefficient
of $f$, and $k$ is its weight.

One can rephrase the statement on the characteristic polynomial as saying
that the representation $\rho_l$ is $\mathbb{Z}$-integral in the sense of §6.2.1, and the following
equation holds:

$$(1 - a(l)l^{-s} + l^{k-1-2s})L(s,f) = L(\rho_l,s). \qquad (6.4.3)$$


This result makes it possible to study congruences modulo a prime number
$l$ for the coefficients $a(n)$. It turns out that such congruences exist only when
$l$ is exceptional for $\rho_l$, i.e. when the image $\mathrm{Im}\,\rho_l$ does not contain $\mathrm{SL}_2(\mathbb{Z}_l)$. In
this case there are certain relations modulo $l$ between the trace $\mathrm{Tr}\,F_{p,l} = a(p)$
and the determinant $\det F_{p,l} = p^{k-1}$ of the matrix $F_{p,l}$. For example, in the
case of the Ramanujan function $\tau(n)$ we have $k = 12$, $l = 691$, and the image
of $\rho_l \bmod l$ (modulo conjugation) lies in the subgroup of upper triangular
matrices $\begin{pmatrix} * & *\\ 0 & *\end{pmatrix} \bmod l$. One has

$$F_{p,l} \equiv \begin{pmatrix} 1 & *\\ 0 & p^{11}\end{pmatrix} \bmod l,$$

which implies $\tau(p) \equiv p^{11} + 1 \bmod l$, and by multiplicativity one obtains
the congruence (6.4.1). H.P. Swinnerton-Dyer in [SwD73] gave the following
description of the possible exceptional primes $l$ for the above cusp forms:


a) there exists an integer $m$ such that $a(n) \equiv n^m\sigma_{k-1-2m}(n) \bmod l$ for all
$n$ coprime to $l$; in this case

$$F_{p,l} \equiv \begin{pmatrix} p^m & *\\ 0 & p^{k-1-m}\end{pmatrix} \bmod l;$$

b) $a(n) \equiv 0 \bmod l$ whenever $n$ is a quadratic non-residue mod $l$;
c) $p^{1-k}a(p)^2 \equiv 0, 1, 2, 4 \bmod l$.


For the Ramanujan function $\tau(n)$ the exceptional primes are: $l = 2, 3, 5,
7, 23, 691$.

The construction of the representation $\rho_l$ is based on methods from algebraic
geometry, in particular on the study of the $l$-adic cohomology groups of
the Kuga–Sato variety $E_\Gamma^w$, which is defined as the fiber product of $w = k - 2$
copies of the universal elliptic curve $E_\Gamma \to X_\Gamma$ over the modular curve
$X_\Gamma = \overline{H}/\Gamma$, $\Gamma = \mathrm{SL}_2(\mathbb{Z})$ (cf. §6.3.2, and [Sho80]).

The variety $E_\Gamma^w$ is defined over $\mathbb{Q}$ and its algebraic (and complex) dimension
is equal to $w + 1 = k - 1$. Deligne has shown that the representation $\rho_l$ of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ occurs in the vector space $H^{k-1}_{\text{ét}}(E_\Gamma^w \otimes \overline{\mathbb{Q}}, \mathbb{Q}_l)$; in other words one can
associate to $f$ a motive $M_f$ of weight $k - 1$ which occurs in the cohomology
of the Kuga–Sato variety. However, the construction of $M_f$ requires many
additional cohomological techniques ([Ja90], [Scho90]).

Ribet in [Ri77] has extended the results of Deligne to primitive modular
forms of arbitrary level.

The Galois representation $\rho_l = \rho_{f,l}$ attached to a cusp form $f$ is irreducible;
if on the other hand we take for $f$ an Eisenstein series which is an eigenfunction
of all Hecke operators, $f \in M_k(N,\psi)$, then it is not difficult to construct a
reducible $l$-adic representation $\rho_l$ whose $L$-function is the Mellin transform of
$f$, i.e. such that the characteristic polynomial of $F_{p,l}$ coincides with

$$1 - \lambda_f(p)p^{-s} + \psi(p)p^{k-1-2s} \qquad (f|T_k(p) = \lambda_f(p)f)$$

for $l \ne p$, $lp$ coprime to $N$. Formula (6.3.35) shows that if $a(0) \ne 0$ then $\lambda_f(p) =
1 + \psi(p)p^{k-1}$. Hence one may take for $\rho_l$ the direct sum $1 \oplus (\rho_\psi \otimes \chi_l^{k-1})$, where
$\chi_l : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathbb{Z}_l^\times$ ($\chi_l(\mathrm{Fr}_p) = p$) is the cyclotomic character and $\rho_\psi :
\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \overline{\mathbb{Q}}^\times \subset \overline{\mathbb{Q}}_l^\times$ is the one dimensional representation associated to $\psi$
via the Kronecker–Weber theorem. For the Eisenstein series the $L$-function of
this representation coincides with

$$\zeta(s)L(s-k+1,\psi) = \prod_p\left[(1 - p^{-s})(1 - \psi(p)p^{k-1-s})\right]^{-1}.$$
6.4.2 A Link with Eichler–Shimura’s Construction


The idea behind Deligne’s construction dates back to Eichler’s study of the 
zeta functions of modular curves; these functions can be characterized as the 
Mellin transforms of cusp forms of weight 2 (see [Eich54]). 

If $\Gamma$ is a congruence subgroup, then there is a one-to-one correspondence
between cusp forms $f \in S_2(\Gamma)$ and holomorphic differentials $f(z)\,dz$ on $X_\Gamma$.
Hence $\dim S_2(\Gamma) = g = g(X_\Gamma)$, where $g(X_\Gamma)$ denotes the genus of $X_\Gamma$.
Formulae for the genera of the curves $X_\Gamma$ can be found in the book of Shimura
(1971). If $\Gamma = \Gamma_0(N)$ then the notation $X_\Gamma = X_0(N)$ is often used. A modular
curve $X_0(N)$ is an algebraic curve defined over $\mathbb{Q}$, such that the Riemann
surface $X_0(N)(\mathbb{C})$ is identified with the compact quotient $\Gamma_0(N)\backslash\overline{H}$ in such a
way that

$$X_0(N)(\mathbb{C}) \cong \Gamma_0(N)\backslash\overline{H}, \qquad \overline{H} = H \cup \mathbb{Q} \cup \{\infty\};$$
$$Y_0(N)(\mathbb{C}) \cong \Gamma_0(N)\backslash H$$

for an affine algebraic curve $Y_0(N)$ defined over $\mathbb{Q}$
(see §6.4.2).

Let us choose a Hecke basis $\{f_1, \ldots, f_g\}$ and consider the corresponding
Euler products $L(s, f_i)$ (see (6.3.37)).

Eichler discovered that the zeta function of the modular curve $X_0(N)$ has
the form

$$\zeta(s)\zeta(s-1)L(X_0(N),s)^{-1},$$

where the $L$-function $L(X_0(N),s)$ coincides up to a finite number of Euler
factors (corresponding to the divisors of the level $N$) with the product

$$\prod_{i=1}^{g} L(s,f_i).$$


Recall that the $L$-function $L(X,s)$ coincides with the $L$-function of the $l$-adic
representation of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the Jacobian $J_X = J_0(N)$ of the curve $X_0(N)$.

Using the $L$-functions $L(s,f_i)$ one can also obtain a decomposition of the
Jacobian into a product of simple Abelian varieties (up to isogeny):

$$J_0(N) \cong A_1 \times \cdots \times A_r. \qquad (6.4.4)$$

One proves that the endomorphism algebra $\mathrm{End}\,A_j \otimes \mathbb{Q} = K_j$ is a totally real
extension of $\mathbb{Q}$ generated by the Hecke eigenvalues $\lambda_f(m)$ ($(m,N) = 1$) of a
cusp form $f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_2(\Gamma_0(N))$. One has

$$L(A_j,s) = \prod_{\sigma} L(s,f^{\sigma}),$$

where $\sigma$ runs through the embeddings $\sigma : K_j \to \mathbb{R}$, and

$$f^{\sigma}(z) = \sum_{n=1}^{\infty} a(n)^{\sigma}e(nz) \in S_2(\Gamma_0(N)). \qquad (6.4.5)$$

In particular, $[K_j : \mathbb{Q}] = \dim A_j$.

A detailed exposition of these results can be found in the book of Shimura 
(1971). 

The especially interesting case $g(X_0(N)) = 1$ arises only for $N = 11, 14,
15, 17, 19, 20, 21, 24, 27, 32, 36, 49$, when the vector space $S_2(\Gamma_0(N))$ is
generated by a single cusp form with integral Fourier coefficients. If $N = 11$
then

$$f(z) = \eta(z)^2\eta(11z)^2, \quad\text{where}\quad \eta(z) = e(z/24)\prod_{m=1}^{\infty}(1 - e(mz)) \qquad (6.4.6)$$

is the Dedekind $\eta$-function. For $N = 36$ we have

$$f(z) = \eta(12z)\theta(z), \qquad (6.4.7)$$

where $\theta(z) = \sum_{n \in \mathbb{Z}} e(n^2z)$ is the theta function (cf. [Frey86]).
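
The Fourier coefficients of the form (6.4.6) can be compared with point counts on an elliptic curve of conductor 11. The sketch below is only an illustration under an explicit assumption: it uses the Weierstrass model $y^2 + y = x^3 - x^2$, a standard curve of conductor 11 that is not named in the text, and checks $a(p) = p + 1 - \#E(\mathbb{F}_p)$ for a few primes $p \ne 11$. The function names are hypothetical.

```python
def eta_product_coefficients(n_max, level=11):
    """Return [a(1), ..., a(n_max)] for q * prod (1 - q^n)^2 (1 - q^(level*n))^2."""
    poly = [0] * n_max          # poly[i] is the coefficient of q^i in the product
    poly[0] = 1
    for n in range(1, n_max):
        for step in (n, level * n):
            if step >= n_max:
                continue        # factor does not affect the truncated series
            for _ in range(2):  # each factor occurs squared
                for d in range(n_max - 1, step - 1, -1):
                    poly[d] -= poly[d - step]
    return poly                 # a(i + 1) = poly[i] because of the extra factor q

def count_points(p):
    """Number of projective points on y^2 + y = x^3 - x^2 over F_p (assumed model)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y + y - x ** 3 + x * x) % p == 0)
    return affine + 1           # add the point at infinity

a = eta_product_coefficients(30)
primes = [2, 3, 5, 7, 13, 17, 19, 23]
# a(p) should equal p + 1 - #E(F_p) at primes of good reduction
matches = all(a[p - 1] == p + 1 - count_points(p) for p in primes)
# Hasse bound |a(p)| <= 2 sqrt(p), tested as a(p)^2 <= 4p
hasse = all(a[p - 1] ** 2 <= 4 * p for p in primes)
print([a[p - 1] for p in primes], matches, hasse)
```

The brute-force point count is quadratic in $p$ and is meant only for small primes.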


6.4.3 The Shimura—Taniyama—Weil Conjecture 


is discussed in more detail in Chapter 7 in connection with Fermat’s Last
Theorem, which has been completely proved in [Wi], [Ta-Wi], together with
the Shimura–Taniyama–Weil conjecture (the STW conjecture) for semi-stable
elliptic curves (cf. also [CSS95], [Tan57], [Wei67], [Frey86], [Gel76]). The STW
conjecture was proved in full generality in 1999 by Ch. Breuil, B. Conrad,
F. Diamond and R. Taylor (cf. [Da99] and Chapter 7 for relevant techniques).

An elliptic curve $E$ defined over $\mathbb{Q}$ is called a modular elliptic curve (a
Weil curve) if there exists a non-constant morphism $\varphi : X_0(N) \to E$. The
Shimura–Taniyama–Weil conjecture says that every elliptic curve over $\mathbb{Q}$ is
modular. The smallest number $N$ with this property is called the analytic
conductor of $E$. In this case $E$ has good reduction modulo all primes $p$ not
dividing $N$, and its $L$-function coincides with the Mellin transform of a cusp
form $f \in S_2(\Gamma_0(N))$:

$$L(E,s) = L(s,f).$$

In particular, the function $L(E,s)$ admits an analytic continuation to the
entire complex plane and satisfies a functional equation of the type

$$\Lambda(E,s) = (2\pi/\sqrt{N})^{-s}\Gamma(s)L(E,s) = \pm\Lambda(E,2-s).$$


This conjecture seems to be both very natural and surprising, since it
establishes a correspondence between two quite different kinds of object: elliptic
curves over $\mathbb{Q}$ and primitive cusp forms of weight 2 with integral coefficients.
Before Wiles’ proof, the conjecture had been verified for a number of curves,
in particular for all curves with complex multiplication. In the latter case the
$l$-adic Galois representation on the Tate module turns out to be Abelian, and
one proves first that its $L$-function coincides with the $L$-function of a Hecke
character of the corresponding imaginary quadratic field (“Grössencharakter”).
The analytic continuation and functional equation of these functions are known
(see 6.2.4), so it follows from Weil’s inverse theorem that $L(E,s) = L(s,f)$ for
some primitive cusp form $f$ of weight 2, which is equivalent to the
Shimura–Taniyama–Weil conjecture. In the above examples the curve $X_0(36)$ admits
complex multiplication by $\mathbb{Q}(\sqrt{-3})$, whereas the curve $X_0(11)$ has no complex
multiplication.


The Shimura–Taniyama–Weil conjecture has a number of interesting
arithmetical corollaries, in particular concerning Fermat’s Last Theorem (cf. [Wi],
[Ta-Wi], [Ri], [CSS95], [Frey86], [Se87] and Chapter 7).

There is an analogue of the Shimura–Taniyama–Weil conjecture describing
all simple Abelian varieties with the property that the degree of the
endomorphism algebra over $\mathbb{Q}$ coincides with the dimension of the variety. These
varieties are thought to correspond to simple factors of the Jacobians of modular
curves [Se87], [Wei71].


6.4.4 The Conjecture of Birch and Swinnerton—Dyer 


(cf. [BSD63], [Ta65a], [Man71], [CW77], [Koly88], [Rub77], and [CR01] for
recent progress). This deep conjecture gives a relation between the most
important arithmetical invariants of an elliptic curve $E$ over a number field
$K$, and the analytic behaviour of the $L$-function $L(E,s) = L(E/K,s)$ at the
point $s = 1$. These arithmetical invariants are: $r_E = \mathrm{rk}\,E(K)$ (the rank of
$E$ over $K$), $E(K)^{tors}$ the torsion subgroup, $R_E$ the regulator of $E$, i.e. the
determinant of the Néron–Tate pairing $h_E$ on

$$E(K)/E(K)^{tors} \subset \mathbb{R}^{r_E},$$

and the Shafarevich–Tate group $\text{Ш}(E,K)$ of $E$ over $K$. By definition

$$L(E,s) = \prod_v L_v(E,s), \qquad (6.4.8)$$

where

$$L_v(E,s)^{-1} = 1 - a(\mathfrak{p}_v)Nv^{-s} + Nv^{1-2s},$$
$$a(\mathfrak{p}_v) = Nv + 1 - \mathrm{Card}\,E(\mathcal{O}_K/\mathfrak{p}_v)$$

for all places $v$ where $E$ has good reduction, and

$$L_v(E,s)^{-1} = 1 - a(\mathfrak{p}_v)Nv^{-s}, \qquad a(\mathfrak{p}_v) = \pm 1 \text{ or } 0$$

for places $v$ with bad reduction, according to the type of bad reduction of
$E \bmod \mathfrak{p}_v$. Here it is assumed that $E$ is defined over $\mathcal{O}_K$, and that it coincides
with its Néron model (minimal model) (cf. §5.2).

In view of the “Riemann conjecture over a finite field” (see (W4) of §6.1.3)
the following estimate holds: $|a(\mathfrak{p}_v)| \le 2\sqrt{Nv}$, which implies the absolute
convergence of the series $L(E,s)$ for $\mathrm{Re}(s) > 3/2$. However, in order to formulate
the conjecture we need stronger analytic properties: we assume the hypothesis
of Hasse–Weil on the existence of an analytic continuation of $L(E,s)$ to the
entire complex plane.

The conjecture of Birch and Swinnerton-Dyer (BSD) consists of two parts
(cf. [Bir63], [BSD63]):

a) the order of the zero $n_E = \mathrm{ord}_{s=1}L(E,s)$ coincides with the rank $r_E$;
b) assume that the Shafarevich–Tate group of $E$ over $K$ is finite; then as
$s \to 1$ the following asymptotic formula holds:

$$L(E,s) \sim (s-1)^{r_E}\,M\,\frac{|\text{Ш}(E,K)|\,R_E}{|E(K)^{tors}|^2}, \qquad (6.4.9)$$

where $M = \prod_{v \in S_E} m_v$ is an explicitly written product of local Tamagawa
factors over the set $S_E$ of all Archimedean places and places where $E$ has
bad reduction, $m_v = \int_{E(K_v)}|\omega|_v$, $\omega$ being the Néron differential of $E$.


For example, if $K = \mathbb{Q}$ an elliptic curve $E$ can be defined by the equation

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6 \quad (a_i \in \mathbb{Z}), \qquad (6.4.10)$$

which is minimal in the sense that the absolute value of its discriminant is
minimal; in this case the Néron differential has the form $dx/(2y + a_1x + a_3)$
([Silv86]).

The BSD conjecture is closely related to the Shimura–Taniyama–Weil
conjecture, because the analytic properties of the functions $L(E,s) = \sum_{n=1}^{\infty} a(n)n^{-s}$
and $L(E,\chi,s) = \sum_{n=1}^{\infty}\chi(n)a(n)n^{-s}$ imply the modular properties of the
corresponding functions $f_\chi(z) = \sum_{n=1}^{\infty}\chi(n)a(n)e(nz)$ in view of the inverse
theorem of Weil (see section 6.3). In all known cases the following functional
equation holds:

$$\Lambda(E,s) = (2\pi/\sqrt{N})^{-s}\Gamma(s)L(E,s) = \varepsilon(E)\Lambda(E,2-s), \qquad (6.4.11)$$

where the number $\varepsilon(E) = \pm 1$ is called the sign (or the “root number”) of $E$.
It is for the Weil curves that some partial results on the validity
of the BSD conjecture are known. Let $\varphi : X_0(N) \to E$ define a Weil curve $E$
of conductor $N$, and let $\omega$ be the Néron differential of $E$. Then the pullback
$\varphi^*\omega$ coincides up to a sign with the differential

$$f(z)\frac{dq}{q} = 2\pi i f(z)\,dz \quad\text{on } X_0(N) \qquad (q = e(z)),$$

where $f \in S_2(\Gamma_0(N))$ is a primitive cusp form of level $N$. One has $L(E,s) =
L(f,s)$ and the following equation holds:

$$L(E,1) = 2\pi\int_0^{\infty} f(iy)\,dy. \qquad (6.4.12)$$

The integral is absolutely convergent in view of the exponential decay of $f(z)$
as $y \to \infty$ or $y \to 0$; it coincides essentially with the Tamagawa factor $m_\infty$
from (6.4.9).


Theorem 6.16 (Coates J., Wiles A. (1977)). Let $E/K$ be an elliptic
curve with complex multiplication and let $K$ be either $\mathbb{Q}$ or the complex
multiplication field. Then the condition $r_E \ge 1$ implies that $n_E \ge 1$, i.e. that
$L(E,1) = 0$.

The proof of this theorem is based on an explicit calculation of the special
value (6.4.12), which is up to a rational multiplicative constant equal to the
Tamagawa factor $m_\infty$. From the existence of a point of infinite order it follows
that this multiplicative constant is divisible by infinitely many primes, and is
hence zero.

A generalization of this result in another direction was found by R. Greenberg
(cf. [Gr83]): let $E/\mathbb{Q}$ be an elliptic curve with complex multiplication. If
the number $n_E$ is odd then either $r_E \ge 1$, or the group $\text{Ш}(E,\mathbb{Q})$ is infinite and contains
a divisible group $\mathbb{Q}_p/\mathbb{Z}_p$ for every good reduction prime $p \ne 2, 3$ (i.e. for
which $E \bmod p$ is an elliptic curve with a non-trivial point of order $p$ over
$\overline{\mathbb{F}}_p$). Developing these ideas K. Rubin and V. Kolyvagin constructed examples
of curves $E/\mathbb{Q}$ with complex multiplication and with finite Shafarevich–Tate
groups $\text{Ш}(E,\mathbb{Q})$. The following deep fact was also proved: if for such a curve
one has $r_E = r_E(\mathbb{Q}) > 1$ and $\varepsilon(E) = -1$, then $n_E \ge 3$. For example, the
curve $E : y^2 = x^3 - 226x$ has rank 3 (generators modulo torsion are: $(-1, 15)$,
$(-8, 36)$, $(121/4, 1155/8)$), and $\varepsilon(E) = -1$; hence $n_E \ge 3$ (compare with the
examples in §6.3.2). These results were extended by Kato to curves without
complex multiplication, using Euler systems, cf. [Kato2000], [MazRub04].

Although these results concern $L$-functions of a complex variable, they use
a lot of $p$-adic theory and properties of $p$-adic $L$-functions. Neither the
domain of definition nor the set of values of these $L$-functions is complex; they
are $p$-adic. These $L$-functions make it possible to control the $p$-adic
behaviour of special values of the type $L(E,\chi,1)$ (where $\chi$ is a Dirichlet character)
which are algebraic numbers (up to a multiplicative factor of the form $m_\infty$)
and may thus be regarded as $p$-adic numbers. Also, the $p$-adic $L$-functions
describe the behaviour of the Selmer groups and the Shafarevich–Tate groups
under Abelian extensions of the ground field $K$, which is either $\mathbb{Q}$ or the
complex multiplication field of the given elliptic curve [Man71], [Man76],
[Man78], [Iwa72], [Coa89], [MW83], [MSD74], and for recent progress, [Coa01],
[Colm03], [CM98].

An important development of the BSD conjecture for modular curves
(including curves with complex multiplication) was obtained by Gross and Zagier
(cf. [GZ86], [GKZ87], [Coa84]). They proved for elliptic Weil curves $E$ that
if $n_E = 1$ then $r_E \ge 1$. They established furthermore the existence of
elliptic curves $E/\mathbb{Q}$ for which $n_E \ge 3$. These results are based on the theory
of special points on modular curves (Heegner points). It was already known
in the nineteenth century how to construct solutions to Pell’s equation using
either special values of trigonometric functions (Dirichlet), or special values
of the Dedekind eta-function (Kronecker) (see Part I, §1.2). Heegner in his
work [Hee52] successfully used special values of elliptic modular functions to
find rational points on elliptic curves, making it possible to find effectively
all imaginary quadratic fields with class number one. In the work of Birch,
[Bir75], extending and clarifying the ideas of [Hee52], the existence of
rational points of infinite order on certain elliptic curves was first established
without explicit evaluation of the coordinates of these points, and without
verification that these points indeed satisfy the equation of the given curve.

Let $\varphi : X_0(N) \to E$ be a Weil parameterization of a given elliptic curve
$E/\mathbb{Q}$. As was noted above (cf. §6.3.2) the set $H/\Gamma_0(N) \subset X_0(N)(\mathbb{C})$
parametrizes the isomorphism classes over $\mathbb{C}$ of isogenies $E_z \to E_z/\langle P\rangle$ with cyclic
kernel $\langle P\rangle$, where $E_z(\mathbb{C}) = \mathbb{C}/\langle 1, z\rangle$ is a varying elliptic curve associated to
the point $z \in H$. Let $K$ be an imaginary quadratic field of discriminant
$D < 0$ with maximal order $\mathcal{O}$. Suppose that there exists an ideal $\mathfrak{i} \subset \mathcal{O}$
such that $\mathcal{O}/\mathfrak{i} \cong \mathbb{Z}/N\mathbb{Z}$ (this condition is satisfied, for example, when $D \equiv$
square $\pmod{4N}$ and $(D, 2N) = 1$). Then one can associate to the isogeny
$\mathbb{C}/\mathcal{O} \to \mathbb{C}/\mathfrak{i}^{-1}$ a point $z$ on $H/\Gamma_0(N)$, and it is not difficult to verify that
this point is rational over the Hilbert class field $H_K$ (the maximal unramified
Abelian extension) of $K$. The point $\varphi(z) = y \in E(H_K)$ is called the Heegner
point on $E$; therefore the point $y_K = \sum_{\sigma \in \mathrm{Gal}(H_K/K)} y^{\sigma} \in E(K)$ is defined
over $K$. Birch and Stephens made extensive calculations of Heegner points
in order to find out under which assumptions the point $y_K$ has infinite
order. They suggested a conjecture, expressing for $L(E,1) = 0$ the special value
$L'(E,1)$ in terms of the product of $m_\infty$ and the Néron–Tate height $h(y_K)$ of
$y_K$ (cf. [BS83]). This conjecture was proved by Gross and Zagier ([GZ86]).
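
The sufficient condition quoted above ($D$ a square modulo $4N$ with $(D, 2N) = 1$) is easy to enumerate. The following sketch lists the odd fundamental discriminants satisfying it for a given level; the function names, the level $N = 11$ and the search bound are assumptions chosen for illustration only.

```python
from math import gcd, isqrt

def is_squarefree(n):
    n = abs(n)
    return all(n % (q * q) for q in range(2, isqrt(n) + 1))

def heegner_discriminants(N, bound):
    """Odd fundamental discriminants -bound <= D < 0 with (D, 2N) = 1 and D a square mod 4N."""
    squares = {(b * b) % (4 * N) for b in range(4 * N)}
    found = []
    # Since (D, 2N) = 1 forces D odd, D is fundamental iff D = 1 mod 4 and D is squarefree.
    for D in range(-3, -bound - 1, -4):      # runs over D = 1 mod 4: -3, -7, -11, ...
        if is_squarefree(D) and gcd(D, 2 * N) == 1 and D % (4 * N) in squares:
            found.append(D)
    return found

level_11 = heegner_discriminants(11, 50)
print(level_11)
```

For each such $D$ the field $\mathbb{Q}(\sqrt{D})$ admits an ideal $\mathfrak{i}$ with $\mathcal{O}/\mathfrak{i} \cong \mathbb{Z}/N\mathbb{Z}$, so Heegner points of that discriminant exist on $X_0(N)$.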

A further significant extension of these results is contained in the work of
V. A. Kolyvagin [Koly88]. He proved that if $L(E,1) \ne 0$ and $y_K$ has infinite
order then the groups $E(\mathbb{Q})$ and $\text{Ш}(E,\mathbb{Q})$ are finite, which proves the first part
of the BSD conjecture in this case. The methods developed by V. A. Kolyvagin make it
possible to find effectively in terms of $K$, $E$ and $y_K$ the smallest positive
integer annihilating the groups $E(\mathbb{Q})$ and $\text{Ш}(E,\mathbb{Q})$. Thus one also has an
approach to proving the second part of the BSD conjecture. The theory of
Euler systems due to V. A. Kolyvagin, cf. [Koly90], see also [MazRub04], also
allows one to consider from a unified point of view Gauss sums, elliptic units,
cyclotomic units and Heegner points, and it gives an approach to proving the
“main conjecture” of Iwasawa theory (see §5.4.5) which describes the Iwasawa
modules attached to these objects in terms of $p$-adic $L$-functions.


Heegner points provide a lot of points on elliptic curves defined over ring
class fields of imaginary quadratic fields.

Mazur formulated a number of conjectures concerning the variation of
Mordell–Weil groups in towers of ring class fields with restricted ramification
(cf. [Maz83], [MazRub03]), predicting that Heegner points should account, in a suitable
sense, for a majority of points rational over the fields in the tower (assuming
that the elliptic curve does not have complex multiplication by the imaginary
quadratic field in question).

Vatsal [Vats03] and Cornut [Cor02] proved a version of this conjecture.


A remarkable fact is that it is sometimes quite easy to verify that the
special value $L(E,1)$ vanishes, but it is extremely difficult to find a point of
infinite order on a curve $E$. In the example of Cassels–Bremner $E : y^2 =
x^3 + 877x$ (cf. [Cas84], [Cas66]) the vanishing of the special value $L(E,1)$
follows from the fact that $E$ is odd; on the other hand the generator of the
group $E(\mathbb{Q})/E(\mathbb{Q})^{tors} \cong \mathbb{Z}$ looks very complicated (see Part I, §1.3.2).

The results of Gross and Zagier found an unexpected application to
Gauss’s famous problem of finding all the imaginary quadratic fields $\mathbb{Q}(\sqrt{-d})$
with class number $h(-d)$ equal to a given number $h$. Previously these fields
had been found explicitly in the cases $h = 1$ and $h = 2$ ([Hee52], [Abr74],
[Ba71], [Deu68], [St67], [St69]).

In 1976 Goldfeld showed in [Gol76] that if there is a cusp form $f \in
S_k(\Gamma_0(N))$ whose Mellin transform has a zero at 1 of sufficiently high order (3
or 4, depending on $k$ and $N$) then for any positive integer $h \ge 1$ one can find
an effective upper bound for $d$ such that $h(-d) = h$. The desired cusp form
has since been found: Mestre showed in [Me85] that the elliptic curve

$$y^2 + y = x^3 - 7x + 6 \qquad (6.4.13)$$

of conductor 5077 and rank 3 ($E(\mathbb{Q}) \cong \mathbb{Z}^3$ with generators $(1,0)$, $(2,0)$, $(0,2)$)
is a Weil curve, i.e. its $L$-function $L(E,s) = \sum_{n=1}^{\infty} a(n)n^{-s}$ is the Mellin
transform of some cusp form $f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_2(\Gamma_0(5077))$. From
the results of Gross and Zagier and from the fact that $E$ is odd (i.e. $\varepsilon(E) = -1$)
one deduces that $n_E \ge 3$, so $f$ is a cusp form with the required properties.
The use of $f$ in the theorem of Goldfeld makes it possible to prove that for a
positive integer $T \ge 1$ there exists an effective constant $B(T) > 0$ such that
if $d$ possesses $T$ different prime divisors, then the following estimate holds:
$h(-d) > B(T)\log d$. If $d$ is a prime then $\log d < 55h(-d)$. Using this result
all $d$ with $h(-d) = 3$ were found, cf. [Oe83].
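
For small discriminants the class numbers $h(-d)$ in this discussion can be computed directly by counting reduced primitive binary quadratic forms. The sketch below does this and searches a finite range for fundamental discriminants with class number 3; the function names and the search bound 1000 are assumptions made for the example, and a finite search of this kind says nothing by itself about larger $d$ (that is exactly what the effective bounds above supply).

```python
from math import gcd, isqrt

def class_number(D):
    """Number of reduced primitive forms ax^2 + bxy + cy^2 with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    h = 0
    b = D % 2                       # b has the same parity as D
    while 3 * b * b <= -D:          # reduced forms satisfy 0 <= |b| <= a <= c
        ac = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= ac:
            if ac % a == 0:
                c = ac // a
                if gcd(gcd(a, b), c) == 1:
                    # (a, b, c) and (a, -b, c) are both reduced unless b = 0, b = a or a = c
                    h += 1 if (b == 0 or a == b or a == c) else 2
            a += 1
        b += 2
    return h

def is_squarefree(n):
    n = abs(n)
    return all(n % (q * q) for q in range(2, isqrt(n) + 1))

def is_fundamental(D):
    if D % 4 == 1:
        return is_squarefree(D)
    if D % 4 == 0:
        m = D // 4
        return m % 4 in (2, 3) and is_squarefree(m)
    return False

# d <= 1000 such that -d is a fundamental discriminant with class number 3
class_number_three = [d for d in range(3, 1001)
                      if is_fundamental(-d) and class_number(-d) == 3]
print(class_number_three)
```

The counting loop runs over at most $O(|D|)$ pairs $(a, b)$, which is adequate for the small range used here.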

For recent developments concerning Gross–Zagier formulas, we refer to
[BeDa97] and [Borch99].

A fruitful use of $L$-functions of elliptic curves and modular forms was
demonstrated in the work of J. Tunnell (cf. [Tun83], [Frey86]) on a classical
Diophantine problem concerning congruent numbers. A natural number $N$
is called a congruent number if it is the area of some right-angled triangle, all
of whose sides have rational lengths. For example, the number 6 is congruent
as it is the area of the Egyptian triangle with the sides 3, 4, and 5. It turns
out that the smallest congruent number is 5, which is equal to the area of the
triangle with sides $3/2$, $20/3$, $41/6$. The fact that $N = 1$ is not congruent
provides an excellent example of the use of Fermat’s infinite descent argument
and also proves Fermat’s Last Theorem for the exponent 4.
Indeed, suppose that $Z > Y > X > 0$ are rational numbers satisfying

$$X^2 + Y^2 = Z^2, \qquad \tfrac{1}{2}XY = N. \qquad (6.4.14)$$

From these equalities we obtain

$$(X+Y)^2 = Z^2 + 4N, \qquad (X-Y)^2 = Z^2 - 4N. \qquad (6.4.15)$$

Multiplying the equations (6.4.15) together shows that the positive rational numbers
$u = Z/2$ and $v = (Y^2 - X^2)/4$ satisfy the equation

$$v^2 = u^4 - N^2. \qquad (6.4.16)$$


Put now $N = 1$ and write the numbers $u, v > 0$, $u, v \in \mathbb{Q}$ in the form: $u = a/b$,
$v = c/d$, where $a$ and $b$ are coprime and $c$ and $d$ are coprime. As a result one
obtains from (6.4.16)

$$c^2b^4 = a^4d^2 - b^4d^2 = (a^4 - b^4)\,d^2.$$

Taking into account the fact that $\mathrm{GCD}(c,d) = 1$ and $\mathrm{GCD}(a^4 - b^4, b^4) = 1$,
we see that $b^4 = d^2$, i.e.

$$a^4 - b^4 = c^2. \qquad (6.4.17)$$

We now rewrite (6.4.17) in the form $(a^2 - c)(a^2 + c) = b^4$ and note that a
prime number dividing both numbers $a^2 - c$ and $a^2 + c$ divides also $2a^2$ and
$2c$. This implies that $\mathrm{GCD}(a^2 - c, a^2 + c) = 1$ or 2 by the coprimality of $a$
and $b$. However, the product of these factors is a fourth power, so we have the
following two possibilities:

$$\begin{cases} a^2 - c = 2C^4\\ a^2 + c = 8D^4\end{cases} \qquad\text{or}\qquad \begin{cases} a^2 - c = 8D^4\\ a^2 + c = 2C^4\end{cases}$$

where $C > 0$, $C$ odd, and $\mathrm{GCD}(C,D) = 1$. In both cases one has $a^2 =
C^4 + 4D^4$, i.e. $4D^4 = (a - C^2)(a + C^2)$. Now considering the factors $a - C^2$
and $a + C^2$, we see that $a + C^2 = 2A^4$ and $a - C^2 = 2B^4$. This in turn implies
that the natural numbers $A, B, C$ satisfy the relation $A^4 - B^4 = C^2$, and
$\max\{A, B, C\} < \max\{a, b, c\}$. We have reached a contradiction.

On the other hand it is not difficult to see that the curve (6.4.16) is
birationally isomorphic to a plane cubic curve $E_N$ having Weierstrass form
$y^2 = x^3 - N^2 x$. In order to show this one uses the substitution

$$X = (N^2 - x^2)/y, \qquad Y = 2Nx/y, \qquad Z = (N^2 + x^2)/y.$$

Reducing modulo primes shows that the points of finite order on $E_N(\mathbb{Q})$ are
precisely those for which $y = 0$, together with the point at infinity. Thus
one obtains the remarkable fact that $N$ is congruent if and only if the group
$E_N(\mathbb{Q})$ is infinite.
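
The substitution can be checked on the example $N = 5$ from the text. The following is a minimal sketch, not part of the original exposition: the function name `triangle_from_point` and the sample point $(x, y) = (-4, 6)$ on $y^2 = x^3 - 25x$ are chosen here for illustration. Since the signs of $X, Y, Z$ depend on the chosen point, the sketch takes absolute values to obtain side lengths.

```python
from fractions import Fraction

def triangle_from_point(N, x, y):
    """Apply X=(N^2-x^2)/y, Y=2Nx/y, Z=(N^2+x^2)/y to a point of y^2 = x^3 - N^2 x."""
    N, x, y = Fraction(N), Fraction(x), Fraction(y)
    if y == 0:
        raise ValueError("points with y = 0 have finite order and give no triangle")
    if y * y != x**3 - N * N * x:
        raise ValueError("(x, y) does not lie on y^2 = x^3 - N^2 x")
    X = (N * N - x * x) / y
    Y = 2 * N * x / y
    Z = (N * N + x * x) / y
    # The signs depend on the point; the side lengths are the absolute values.
    return tuple(sorted(abs(t) for t in (X, Y, Z)))

# Hypothetical sample point on E_5: 6^2 = (-4)^3 - 25 * (-4) = 36.
sides = triangle_from_point(5, -4, 6)
print(sides)  # the two legs followed by the hypotenuse
```

With exact rational arithmetic one can then confirm directly that the two legs and the hypotenuse satisfy (6.4.14) with $N = 5$.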

J.B. Tunnell proved in 1983 that if an odd squarefree natural number $N$ is a
congruent number then

$$\mathrm{Card}\{(x,y,z) \in \mathbb{Z}^3 \mid 2x^2 + y^2 + 8z^2 = N\} = 2\,\mathrm{Card}\{(x,y,z) \in \mathbb{Z}^3 \mid 2x^2 + y^2 + 32z^2 = N\}.$$

Assuming the Birch–Swinnerton-Dyer conjecture for the curves $E_N$ he showed
that this condition is also sufficient.
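
Tunnell's condition is easy to test by brute force for small $N$. The sketch below is only an illustration under stated assumptions: the function names are invented here, the search simply enumerates all triples with $|x|, |y|, |z| \le \sqrt{N}$, and a result of `True` means "congruent" only conditionally on the Birch–Swinnerton-Dyer conjecture, whereas `False` shows unconditionally that $N$ is not congruent.

```python
from math import isqrt

def count_solutions(N, a, b, c):
    """Count integer triples (x, y, z) with a*x^2 + b*y^2 + c*z^2 == N (a, b, c >= 1)."""
    bound = isqrt(N)  # every coordinate satisfies |t| <= sqrt(N)
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            for z in range(-bound, bound + 1):
                if a * x * x + b * y * y + c * z * z == N:
                    count += 1
    return count

def passes_tunnell_test(N):
    """Check the counting identity stated in the text for an odd N."""
    if N % 2 == 0:
        raise ValueError("this sketch only covers odd N")
    return count_solutions(N, 2, 1, 8) == 2 * count_solutions(N, 2, 1, 32)

for N in (1, 3, 5, 7):
    print(N, passes_tunnell_test(N))
```

For $N = 1$ and $N = 3$ the identity fails, in agreement with the descent argument above for $N = 1$; for $N = 5$ and $N = 7$ both counts vanish and the identity holds.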

In connection with the Shimura—Taniyama—Weil conjecture we point out 
the result of G.V.Belyi, [Be79]: every algebraic curve defined over a number 
field can cover the projective line with ramification points lying only above
$0$, $1$, and $\infty$, the cover being defined over a number field. From this result it
follows in particular that every elliptic curve over $\mathbb{Q}$ admits a parameterization
by modular forms with respect to a subgroup of finite index in $SL_2(\mathbb{Z})$ (which
is not necessarily a congruence subgroup). This result made it possible to 
solve the embedding problem over certain cyclotomic extensions of Q, and to 
construct Galois extensions of such fields whose Galois groups are given finite 
simple groups with two generators. Previously the embedding problem over 
Q was solved by I.R.Shafarevich for all finite solvable groups in [Sha54]. 


6.4.5 The Artin Conjecture and Cusp Forms 


(cf. [DS75], [L71b], [L80], [Tun81], [Hen76], [Hir88], [Gel95]). The correspondence
between primitive cusp forms of weight 2 with respect to $\Gamma_0(N)$ and
elliptic curves given above by the Shimura–Taniyama–Weil conjecture (see
also Chapter 7) has a remarkable analogue in the case of cusp forms of weight
1. Langlands has conjectured, and Deligne and Serre have precisely formulated,
a link between primitive cusp forms $f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_1(N,\psi)$
and irreducible two dimensional complex Galois representations

$$\rho_f : G(\overline{\mathbb{Q}}/\mathbb{Q}) \to GL_2(\mathbb{C}). \tag{6.4.18}$$


The condition that $f$ is primitive includes the conditions

$$f|T(p) = a(p) f, \qquad a(1) = 1 \qquad ((p,N) = 1). \tag{6.4.19}$$

Deligne and Serre proved the existence of irreducible representations $\rho_f$
unramified outside the divisors of $N$ such that 1) $\det\rho_f = \rho_\psi$ is a one dimensional
Galois representation which corresponds via the Kronecker–Weber theorem to
an odd Dirichlet character $\psi : (\mathbb{Z}/N\mathbb{Z})^\times \to \mathbb{C}^\times$, $\psi(-1) = -1$, and 2) the
trace $\mathrm{Tr}\,F_{p,\rho_f}$ of the image of the Frobenius element coincides with $a(p)$ for
all $p$, $(p,N) = 1$. For Eisenstein series $f \in M_1(N,\psi)$ with conditions (6.4.19)
such Galois representations can be easily constructed and turn out to be direct
sums of Dirichlet characters.

A remarkable consequence of the construction is the proof of the
Ramanujan–Petersson conjecture in the case of weight $k = 1$:

$$|a(p)| \le 2p^{(k-1)/2} = 2. \tag{6.4.20}$$

Indeed, since $\rho_f$ has finite image, the number $\mathrm{Tr}\,F_{p,\rho_f} = a(p)$ is the sum of
two roots of unity. The estimate (6.4.20) obviously holds for Eisenstein series;
for weight $k = 1$ it is therefore also true for cusp forms.

The Ramanujan–Petersson conjecture is related to the Sato–Tate Conjecture
on the uniform distribution of the arguments $\varphi_p$ of Frobenius automorphisms
in the segment $[0,\pi]$ with respect to the measure $\frac{2}{\pi}\sin^2\varphi\,d\varphi$ (cf.
Chapter I in [Se68a], [Mich01] and §6.5.1).

The construction of $\rho_f$ uses the reduction mod $l$ of the $l$-adic representations
attached to modular forms of weight $k \ge 2$. First one proves that a given
cusp form of weight 1 has the same Fourier coefficients modulo a prime ideal
as a cusp form $g$ of a higher weight $k \ge 2$. Then one verifies that the $l$-adic
representation $\rho_g \bmod l$ can be lifted to characteristic zero, and one gets as
a result the desired complex representation $\rho_f$, for which properties 1) and 2)
are valid since they are satisfied modulo infinitely many prime numbers.

The conditions 1) and 2) on the representation $\rho_f$ mean that the L-series
$L(s,f)$ (the Mellin transform of $f$) coincides with the Artin L-series of the
representation $\rho_f$, i.e.

$$L(s,f) = L(\rho_f, s), \tag{6.4.21}$$

and the analytic properties of $L(\rho_f,s)$ follow from those of $L(s,f)$ described
in 6.3.3. For complex representations $\rho : G(\overline{\mathbb{Q}}/\mathbb{Q}) \to GL_n(\mathbb{C})$ the statement
that $L(\rho,s)$ is holomorphic is known as the Artin conjecture. It follows that
this conjecture is true for representations of the type $\rho = \rho_f$. Conversely if
one knows for a two dimensional representation $\rho$ with odd determinant $\det\rho$
that the functions $L(\rho\otimes\chi, s)$ for all Dirichlet characters $\chi$ are holomorphic,
then the existence of a cusp form $f$ satisfying (6.4.21) can be deduced from
Weil's inverse theorem (see §6.3.3). It turns out that the image of $\rho(G(\overline{\mathbb{Q}}/\mathbb{Q}))$
in $PGL_2(\mathbb{C})$, for a two dimensional irreducible representation $\rho$, is always one of
the following groups:


1) a dihedral group, in which case the representation $\rho$ is monomial (i.e.
induced from a character of a cyclic subgroup);

2) $A_4$ (tetrahedral case);

3) $S_4$ (octahedral case);

4) $A_5$ (icosahedral case).


Langlands and Tunnell proved the conjecture on the existence of a cusp
form $f$ for which $\rho = \rho_f$ in cases 2) and 3). The validity of the Artin conjecture
in case 4) remains unknown in general. However J. Buhler (see in [Hen76])
gave an example of a representation of type 4) for which the Artin conjecture is
valid, as well as the existence of the corresponding cusp form. In this example
$\mathrm{Ker}\,\rho = G(\overline{\mathbb{Q}}/K)$ where $K$ is the splitting field of the polynomial

$$x^5 + 10x^3 - 10x^2 + 35x - 18,$$

and $N = 800$. An interesting program was outlined by R. Taylor to solve
the Artin conjecture for icosahedral Galois representations (see [Tay02] for
recent progress).


The Artin conductor 


The number $N$ in the construction of Serre and Deligne has an interpretation
as the Artin conductor of the representation $\rho_f$, which is defined for every
finite dimensional Galois representation $\rho : G_{\mathbb{Q}} \to GL(V)$ with finite image as
follows. Let $p$ be a prime, $\mathfrak{p}$ a prime ideal of the ring $\mathcal{O} \subset \overline{\mathbb{Q}}$ of all algebraic
integers, $p \in \mathfrak{p}$. Then the image of the decomposition group

$$G^{(\mathfrak{p})} = \{\sigma \in G_{\mathbb{Q}} \mid \sigma\mathfrak{p} = \mathfrak{p}\}$$

is isomorphic to the Galois group of some finite extension $F/\mathbb{Q}_p$,
$\rho(G^{(\mathfrak{p})}) \cong G(F/\mathbb{Q}_p)$. Let $v_F$ be the normalized valuation of $F$, i.e.
$v_F(F^\times) = \mathbb{Z}$. Define the ramification groups

$$G_{\mathfrak{p},i} = \{\sigma \in G(F/\mathbb{Q}_p) \mid v_F(x - \sigma(x)) > i \text{ for all } x \in \mathcal{O}_F\}$$

and let $V_{\mathfrak{p},i} = V^{G_{\mathfrak{p},i}}$. In particular $G_{\mathfrak{p},0}$ is the inertia subgroup, and the fact
that $\rho$ is unramified over $p$ is equivalent to saying that $V = V_{\mathfrak{p},0}$.
Then the Artin conductor is defined (cf. [Ar30], [Ar65], [Se63]) by

$$N = N(\rho) = \prod_p p^{n(p,\rho)}, \tag{6.4.22}$$

where the exponent $n(p,\rho)$ is defined by

$$n(p,\rho) = \sum_{i=0}^{\infty} \frac{1}{(G_{\mathfrak{p},0} : G_{\mathfrak{p},i})}\,\dim V/V_{\mathfrak{p},i}. \tag{6.4.23}$$

This turns out to be an integer (at first sight it only looks like a rational
number). One has $n(p,\rho) = \dim V/V_{\mathfrak{p},0} + b_p(V)$, where the number $b_p(V)$
is called the wild invariant of the representation $\rho$ over $p$. One can show that
for one dimensional representations the Artin conductor coincides with the
conductor of the corresponding character of the idele class group, attached to
it by class field theory (cf. [Se63]).


6.4.6 Modular Representations over Finite Fields


(cf. [Se87]). Based on a deep analysis of previous constructions, Serre suggested
in 1987 a universal description of all two dimensional Galois representations
over finite fields in terms of cusp forms. Let $p$ be a prime number, $\mathfrak{p}$ a prime
ideal of the ring of all algebraic integers $\mathcal{O} \subset \overline{\mathbb{Q}}$ dividing $p$ (i.e. $p \in \mathfrak{p}$). We call
a representation

$$\rho : G_{\mathbb{Q}} \to GL_2(\overline{\mathbb{F}}_p) \tag{6.4.24}$$

a modular representation of type $(N,k,\psi)$, if for some modular form

$$f(z) = \sum_{n=1}^{\infty} a(n)e(nz) \in S_k(N,\psi),$$

which is an eigenform of the Hecke operators normalized by $a(1) = 1$, the
following condition is satisfied

$$\mathrm{Tr}(F_{l,\rho}) \equiv a(l) \bmod \mathfrak{p} \tag{6.4.25}$$

for all primes $l \nmid Np$.

Serre conjectured that every irreducible representation (6.4.24) with odd
determinant is modular for some $N$ not divisible by $p$. He also described
explicitly the numbers $N$ and $k$ and the character $\psi$, assuming that $N$ and $k$
are minimal under the condition $(N,p) = 1$. According to this conjecture the
number $N$ is determined by the ramification of $\rho$ outside $p$ in the same way as
the Artin conductor:

$$N = N(\rho) = \prod_{l \ne p} l^{n(l,\rho)}.$$

The weight $k$ is defined by ramification properties of $\rho$ at $p$, and the character
$\psi$ is determined by the following condition on the determinant of $\rho$:

$$\det\rho(\mathrm{Frob}_l) \equiv \psi(l)\, l^{k-1} \bmod \mathfrak{p} \qquad (l \nmid Np).$$

Serre gave many concrete examples of representations $\rho$ for which the
corresponding cusp form $f$ with $N$, $k$ and $\psi$ as predicted by the conjecture can be
explicitly constructed.

We point out some remarkable consequences of this conjecture. First of all
it implies the validity of the Shimura–Taniyama–Weil conjecture for elliptic
curves over $\mathbb{Q}$ and for simple Abelian varieties with real multiplication (see
§6.4.3 and Chapter 7). Also this conjecture would imply Fermat's last theorem.
This corollary can be shown using the elliptic curve of Frey
$E : y^2 = x(x - A)(x + B)$, where

$$A = a^p, \quad B = b^p, \quad C = c^p, \qquad a, b, c \in \mathbb{Z}, \quad p \ge 5,$$

are integers satisfying the condition $A + B + C = 0$, $ABC \ne 0$ (a non-trivial
solution of Fermat's equation), see Chapter 7.
According to this conjecture of Serre, the Galois representation

$$\rho : G_{\mathbb{Q}} \to \mathrm{Aut}\,E_p \cong GL_2(\mathbb{F}_p)$$

on points of order $p$ of the elliptic curve of Frey–Hellegouarch should
correspond to a cusp form $f \in S_2(\Gamma_0(2))$. However,

$$\dim S_2(\Gamma_0(2)) = g(X_0(2)) = 0,$$

hence such a cusp form cannot exist.
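
The vanishing of the genus can be confirmed numerically. The sketch below is an illustration only: it assumes the classical genus formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$ for $X_0(N)$, which is not stated in this section, and all function names are invented here.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Distinct prime divisors of n, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def euler_phi(n):
    result = n
    for p in prime_factors(n):
        result -= result // p
    return result

def genus_X0(N):
    """Genus of X_0(N) from the classical formula (an assumption of this sketch)."""
    primes = prime_factors(N)
    mu = Fraction(N)                      # index of Gamma_0(N) in SL_2(Z)
    for p in primes:
        mu *= Fraction(p + 1, p)
    nu2 = 0 if N % 4 == 0 else 1          # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1          # elliptic points of order 3
    for p in primes:
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    g = 1 + mu / 12 - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(cusps, 2)
    if g.denominator != 1:
        raise ArithmeticError("genus formula did not give an integer")
    return int(g)

print(genus_X0(2), genus_X0(11))
```

For $N = 2$ the formula gives genus 0, matching the statement used in the argument, while $N = 11$ is the smallest level with a nonzero cusp form of weight 2.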

Another approach to proving the non-existence of the elliptic curves of
Frey–Hellegouarch consists of applying to the corresponding "arithmetic
surface" (a scheme over Spec $\mathbb{Z}$ of dimension 2) an analogue of a result on
non-singular projective surfaces of general type over an algebraically closed field
of characteristic zero (the inequality of Bogomolov–Miyaoka–Yau, [Miy77],
[Par87]).


6.5 Automorphic Forms and The Langlands Program 


6.5.1 A Relation Between Classical Modular Forms and 
Representation Theory 


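(cf. [Bor79], [PSh79]). The domain of definition of the classical modular forms
(the upper half plane) is a homogeneous space $H = \{z \in \mathbb{C} \mid \mathrm{Im}\,z > 0\}$ of the
reductive group $G(\mathbb{R}) = GL_2(\mathbb{R})$:

$$H = GL_2(\mathbb{R})/O(2)\cdot Z,$$

where $Z = \left\{\begin{pmatrix} x & 0 \\ 0 & x\end{pmatrix} \,\middle|\, x \in \mathbb{R}^\times\right\}$ is the center of $G(\mathbb{R})$ and $O(2)$ is the orthogonal
group, see (6.….1). Therefore each modular form

$$f(z) = \sum_{n=0}^{\infty} a(n)e(nz) \in M_k(N,\psi) \subset M_k(\Gamma_N) \tag{6.5.1}$$

can be lifted to a function $\tilde f$ on the group $GL_2(\mathbb{R})$ with the invariance condition

$$\tilde f(\gamma g) = \tilde f(g) \quad\text{for all } \gamma \in \Gamma_N \subset GL_2(\mathbb{R}).$$

In order to do this let us consider the function

$$\tilde f(g) = \begin{cases} f(g(i))\, j(g,i)^{-k} & \text{if } \det g > 0, \\ f(g(-i))\, j(g,-i)^{-k} & \text{if } \det g < 0, \end{cases} \tag{6.5.2}$$

where $g = \begin{pmatrix} a & b \\ c & d\end{pmatrix} \in GL_2(\mathbb{R})$ and $j(g,z) = |\det g|^{-1/2}(cz + d)$ is the factor of
automorphy.

One has $\tilde f(gx) = \exp(-ik\theta)\tilde f(g)$ if $x = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix}$ is the rotation
through the angle $\theta$.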

Consider the group $GL_2(\mathbb{A})$ of non-degenerate matrices with coefficients
in the adele ring $\mathbb{A}$ and its subgroup

$$U(N) = \Big\{ g = 1 \times \prod_p g_p \in GL_2(\mathbb{A}) \;\Big|\; g_p \in GL_2(\mathbb{Z}_p),\ g_p \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \bmod N\mathbb{Z}_p \Big\}. \tag{6.5.3}$$

From the Chinese remainder theorem (the approximation theorem) one obtains
the following coset decomposition:

$$\Gamma_N \backslash GL_2(\mathbb{R}) \cong GL_2(\mathbb{Q}) \backslash GL_2(\mathbb{A}) / U(N), \tag{6.5.4}$$

using which we may consider $\tilde f$ as a function on the homogeneous space (6.5.4),
or even on the adele group $GL_2(\mathbb{A})$.

The action of $GL_2(\mathbb{A})$ on $\tilde f$ by group shifts defines a representation $\pi = \pi_f$
of the group $GL_2(\mathbb{A})$ in the space of smooth complex valued functions on
$GL_2(\mathbb{A})$, for which

$$(\pi(h)\tilde f)(g) = \tilde f(gh) \qquad (g, h \in GL_2(\mathbb{A})).$$


The condition that the representation $\pi_f$ be irreducible has a remarkable
arithmetical interpretation: it is equivalent to $f$ being an eigenfunction of the
Hecke operators for almost all $p$. If this is the case then one has an infinite
tensor product decomposition

$$\pi = \bigotimes_v \pi_v, \tag{6.5.5}$$

where the $\pi_v$ are representations of the local groups $GL_2(\mathbb{Q}_v)$ with $v = p$ or
$\infty$.

Jacquet and Langlands chose irreducible representations of groups such as
$GL_2(\mathbb{Q}_v)$ as a starting point for the construction of L-functions (cf. [JL70],
[Bor79]). These representations can be classified and explicitly described. Thus
for the representations $\pi_v$ in (6.5.5) one can verify for almost all $v = v_p$
that the representation $\pi_v$ has the form of an induced representation
$\pi_v = \mathrm{Ind}(\mu_1 \otimes \mu_2)$ from a one dimensional representation of the subgroup of
diagonal matrices

$$(\mu_1 \otimes \mu_2)\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = \mu_1(x)\mu_2(y),$$

where $\mu_i : \mathbb{Q}_p^\times \to \mathbb{C}^\times$ are unramified quasicharacters (see §6.2.4). This
classification makes it possible to define for almost all $p$ an element
$h_p = \begin{pmatrix} \mu_1(p) & 0 \\ 0 & \mu_2(p) \end{pmatrix} \in GL_2(\mathbb{C})$. From this one can construct the following Euler
product (the L-function of the automorphic representation $\pi$)

$$L(\pi,s) = \prod_{p \notin S} L(\pi_p,s) = \prod_{p \notin S} \det(1_2 - p^{-s}h_p)^{-1}, \tag{6.5.6}$$

in which the product is extended over all but a finite number of primes.
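
As a small worked step (added here for orientation, not taken from the original text), each local factor of the Euler product can be written out explicitly, since $h_p$ is the diagonal matrix with entries $\mu_1(p)$ and $\mu_2(p)$:

```latex
\det(1_2 - p^{-s}h_p)^{-1}
  = \bigl(1 - \mu_1(p)\,p^{-s}\bigr)^{-1}\bigl(1 - \mu_2(p)\,p^{-s}\bigr)^{-1}
  = \bigl(1 - (\mu_1(p) + \mu_2(p))\,p^{-s} + \mu_1(p)\mu_2(p)\,p^{-2s}\bigr)^{-1}.
```

This is a quadratic Euler factor of the same shape as the local factors of the Hecke series of a modular form.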
It turns out that the function $L(\pi_f, s)$ coincides essentially with the Mellin
transform of the modular form $f$:

$$L(s,f) = L(\pi_f, s + (k-1)/2).$$

The notion of a primitive form $f$ also takes on a new meaning: the corresponding
function $\tilde f$ from the representation space of an irreducible representation $\pi$
must have a maximal stabilizer. The theory of Atkin–Lehner can be reformulated
as saying that the representation $\pi$ occurs with multiplicity one in the
regular representation of the group $GL_2(\mathbb{A})$ (the space of all square integrable
functions).

More generally, an automorphic representation is defined as an irreducible
representation of an adele reductive group $G(\mathbb{A})$ in the space of functions on
$G(\mathbb{A})$ with some growth and smoothness conditions.

Jacquet and Langlands constructed for irreducible admissible automorphic
representations $\pi$ of the group $GL_2(\mathbb{A})$ analytic continuations of the
corresponding L-functions $L(\pi,s)$, and established functional equations relating
$L(\pi,s)$ to $L(\tilde\pi, 1-s)$, where $\tilde\pi$ is the dual representation. For the functions
$L(\pi_f,s)$ this functional equation is exactly Hecke's functional equation (see
(6.3.44)).

Note that the notion of an automorphic representation includes as special
cases: 1) the classical elliptic modular forms, 2) the real analytic wave modular
forms of Maass, 3) Hilbert modular forms, 4) real analytic Eisenstein series
of type $\sum{}' \frac{y^s}{|cz+d|^{2s}}$, 5) Hecke L-series with Grössencharacters (or rather
their inverse Mellin transforms), 6) automorphic forms on quaternion algebras
etc.

Interesting classes of Euler products are related to finite dimensional
complex representations

$$r : GL_2(\mathbb{C}) \to GL_m(\mathbb{C}).$$

Let us consider the Euler product

$$L(\pi,r,s) = \prod_{p \notin S} L(\pi_p,r,s), \tag{6.5.7}$$

where

$$L(\pi_p,r,s) = \det(1_m - p^{-s}r(h_p))^{-1}.$$

These products converge absolutely for $\mathrm{Re}(s) \gg 0$, and, conjecturally,
admit analytic continuations to the entire complex plane and satisfy functional
equations (cf. [Bor79], [BoCa79], [L71a], [Del79], [Se68a]).

This conjecture has been proved in some special cases, for example when
$r = \mathrm{Sym}^i\,\mathrm{St}$ is the $i$-th symmetric power of the standard representation
$\mathrm{St} : GL_2(\mathbb{C}) \to GL_2(\mathbb{C})$ for $i = 2,3,4,5$ (cf. [Sh88]).

The Ramanujan–Petersson conjecture, proved by Deligne, can be formulated
as saying that the absolute values of the eigenvalues of $h_p \in GL_2(\mathbb{C})$ for
a cusp form $f$ are all equal to 1.
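
In classical terms the statement is the bound $|a(p)| \le 2p^{(k-1)/2}$ of (6.4.20) for a cusp form of weight $k$. The following sketch illustrates it on an example that is an assumption of this note rather than part of the section: the weight 12 form $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$ with coefficients $\tau(n)$. The function names are invented here, and the angle is defined by $\cos\varphi_p = \tau(p)/(2p^{11/2})$.

```python
import math

def ramanujan_tau(limit):
    """Return [tau(1), ..., tau(limit)] from Delta = q * prod_{n>=1} (1 - q^n)^24."""
    coeffs = [0] * limit   # coeffs[i] is the coefficient of q^i in the product
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n), truncating at degree limit - 1.
            for i in range(limit - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return coeffs          # tau(m) is coeffs[m - 1] because of the extra factor q

def frobenius_angle(p, a_p, k=12):
    """Angle in [0, pi] with 2 * p^((k-1)/2) * cos(angle) = a_p."""
    cos_phi = a_p / (2 * p ** ((k - 1) / 2))
    return math.acos(cos_phi)  # raises ValueError if the bound were violated

tau = ramanujan_tau(12)
angles = {p: frobenius_angle(p, tau[p - 1]) for p in (2, 3, 5, 7, 11)}
for p, phi in angles.items():
    print(p, tau[p - 1], round(phi, 4))
```

The call to `math.acos` succeeds exactly when the normalized coefficient lies in $[-1, 1]$, so the sketch doubles as a check of the bound for the listed primes.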

As a consequence of the conjectured analytic properties of the functions
(6.5.7) one could deduce the following conjecture of Sato and Tate about
the distribution of the arguments of the Frobenius elements: let $\alpha(p) = e^{i\varphi_p}$
($0 \le \varphi_p \le \pi$) be an eigenvalue of the matrix $h_p$ defined above. Then for
cusp forms $f$ without complex multiplication (i.e. the Mellin transform of $f$ is
not the L-function of a Hecke Grössencharacter (see §6.2.4) of an imaginary
quadratic field) the arguments $\varphi_p$ are conjecturally uniformly distributed in
the segment $[0,\pi]$ with respect to the measure $\frac{2}{\pi}\sin^2\varphi\,d\varphi$ (cf. [Se68a]).


In the case of complex multiplication the analytic properties of the
L-functions are reduced to the corresponding properties of the L-functions of
Hecke Grössencharacters (see §6.2.4), which imply the uniform distribution of
the arguments $\varphi_p$ with respect to the usual Lebesgue measure.

The arithmetical nature of the numbers $e^{i\varphi_p}$ is close to that of the signs
of Gauss sums $\alpha(p) = g(\chi)/\sqrt{p}$ where $g(\chi) = \sum_{u=1}^{p-1}\chi(u)e(u/p)$, $\chi$ being a
primitive Dirichlet character modulo $p$. Even if $\chi$ is a quadratic character, the
precise evaluation of the sign $\alpha(p) = \pm 1$ is rather delicate (see [BS85]). If $\chi$
is a cubic character, i.e. if $\chi^3 = 1$, then $p = 6t + 1$, and the sums lie inside
the 1st, the 3rd or the 5th sextant of the complex plane. Using methods from
the theory of automorphic forms S.J. Patterson and D.R. Heath-Brown solved
the problem of Kummer on the distribution of the arguments of cubic Gauss
sums by means of a cubic analogue of the theta series, which is a certain
automorphic form on the threefold covering of the group $GL_2$ ([Del80a], [HBP79],
[Kub69]).
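
The quadratic case is easy to explore numerically. The sketch below is illustrative only: it assumes $e(x) = \exp(2\pi i x)$, takes $\chi$ to be the Legendre symbol modulo an odd prime $p$, and uses function names invented here. By Gauss's classical evaluation the normalized sum equals $1$ when $p \equiv 1 \pmod 4$ and $i$ when $p \equiv 3 \pmod 4$.

```python
import cmath
import math

def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p, via Euler's criterion."""
    r = pow(u, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def normalized_gauss_sum(p):
    """g(chi)/sqrt(p) for the quadratic character chi modulo the odd prime p."""
    g = sum(legendre(u, p) * cmath.exp(2j * cmath.pi * u / p) for u in range(1, p))
    return g / math.sqrt(p)

for p in (3, 5, 7, 13):
    alpha = normalized_gauss_sum(p)
    print(p, round(alpha.real, 6), round(alpha.imag, 6))
```

The computation confirms that the normalized sum has absolute value 1; the difficulty alluded to in the text lies in proving which of the possible values actually occurs.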


6.5.2 Automorphic L-Functions


The approach of Jacquet–Langlands made it possible to extend the whole
series of notions and results concerning L-functions to the general case of
automorphic representations of reductive groups over a global field $K$. Let $G$
be a linear group over $K$, $G_{\mathbb{A}} = G(\mathbb{A})$ its group of points with coefficients in
the adele ring of the field $K$. Automorphic representations are often defined as
representations belonging to the regular smooth representation of the group
$G_{\mathbb{A}}$, and one denotes by the symbol $\mathcal{A}(G/K)$ the set of equivalence classes of
irreducible admissible automorphic representations of $G_{\mathbb{A}}$. A representation $\pi$
from this class admits a decomposition $\pi = \otimes_v \pi_v$ where $v \in \Sigma_K$ runs through
the places of $K$ and the $\pi_v$ are representations of the groups $G_v = G(K_v)$. In
order to construct L-functions, the L-group ${}^L G$ of $G$ is introduced. Consider
the tuple of root data (cf. [Bor79], [Spr81])

$$\psi_0(G) = (X^*(T), \Delta, X_*(T), \Delta^\vee) \tag{6.5.8}$$

of the group $G$; here $T$ is a maximal torus of $G$ (over a separable closure of the
ground field $K$); $X^*(T)$ is the group of characters of $T$; $X_*(T)$ the group of
one parameter subgroups of $T$ and $\Delta$ (resp. $\Delta^\vee$) is a basis of the root system
(resp. the dual basis of the system of coroots). The connected component
of the Langlands L-group ${}^L G^0$ is defined to be the complex reductive group
obtained by inversion $\psi_0 \mapsto \psi_0^\vee$, whose root data is isomorphic to the inverse

$$\psi_0(G)^\vee = (X_*(T), \Delta^\vee, X^*(T), \Delta). \tag{6.5.9}$$


If $G$ is a simple group, then the group ${}^L G(\mathbb{C})$ can be characterized up to a
central isogeny by one of the types $A_n, B_n, \dots, G_2$ of the Cartan–Killing
classification. It is known that the map $\psi_0 \mapsto \psi_0^\vee$ interchanges the types
$B_n$ and $C_n$, and leaves all other types fixed. Thus if $G = Sp_n$ (respectively
$GSp_n$), then ${}^L G^0 = SO_{2n+1}(\mathbb{C})$ (resp. ${}^L G^0 = \mathrm{Spin}_{2n+1}(\mathbb{C})$). The whole group
${}^L G$ is then defined as the semi-direct product of ${}^L G^0$ with the Galois group
$\mathrm{Gal}(K^s/K)$ of an extension $K^s$ of the ground field $K$ over which $G$ splits (i.e.
its maximal torus $T$ becomes isomorphic to a product of copies of $GL_1$). This
semi-direct product is determined by the action of the Galois group
$G_K = \mathrm{Gal}(K^s/K)$ on the set of maximal tori defined over $K^s$.
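
At the level of Cartan matrices, exchanging roots and coroots amounts to transposition, which makes the interchange of $B_n$ and $C_n$ easy to see. The sketch below is an illustration under a stated assumption: it uses one common labelling of the simple roots (the double bond joins the last two nodes), and the function names are invented here.

```python
def cartan_matrix(kind, n):
    """Cartan matrix of type A_n, B_n or C_n (n >= 2), in one common labelling."""
    if kind not in ("A", "B", "C"):
        raise ValueError("only types A, B, C are covered by this sketch")
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 2
        if i + 1 < n:
            m[i][i + 1] = -1
            m[i + 1][i] = -1
    if kind == "B":
        m[n - 2][n - 1] = -2   # the last simple root is short
    elif kind == "C":
        m[n - 1][n - 2] = -2   # the last simple root is long
    return m

def dual(m):
    """Cartan matrix of the dual root system: the transpose."""
    return [list(row) for row in zip(*m)]

print(dual(cartan_matrix("B", 3)) == cartan_matrix("C", 3))
print(dual(cartan_matrix("A", 3)) == cartan_matrix("A", 3))
```

The simply laced type $A_n$ has a symmetric Cartan matrix and is therefore self-dual, in line with the statement that all types other than $B_n$ and $C_n$ are left fixed.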

The most important classification result of the Langlands theory states
that if

$$\pi = \bigotimes_v \pi_v \in \mathcal{A}(G/K)$$

then for almost all $v$ the local component $\pi_v$ corresponds to a unique conjugacy
class of an element $h_v$ in the group ${}^L G$.
Let us consider the Euler product

$$L(\pi,r,s) = \prod_{v \notin S} L(\pi_v,r,s), \tag{6.5.10}$$

where $S$ is a finite set of places of $K$,

$$L(\pi_v,r,s) = \det(1_m - Nv^{-s}r(h_v))^{-1}.$$


Langlands has shown that if $\pi \in \mathcal{A}(G/K)$ then the product in (6.5.10)
converges absolutely for all $s$ with sufficiently large real part $\mathrm{Re}(s)$ (cf. [L71a]).
The product (6.5.10) defines an automorphic L-function only up to a finite
number of Euler factors. Although this is sufficient for certain questions related
to analytic continuation of these functions, the precise form of these missing
factors is very important in the study of the functional equations. A list of
standard conjectures on the analytic properties of the L-functions (6.5.10) can
be found in A. Borel's paper [Bor79].

We refer to recent introductory texts on the theory of automorphic
L-functions and the Langlands program: [BCSGKK3], [Bum97], [Iw97].

For the group $G = GL_n$ and the standard representation $r = r_n = \mathrm{St}$ of
${}^L G^0 = GL_n(\mathbb{C})$ the main analytic properties of the L-functions (6.5.10) are
proved in [JPShS], [GPShR87], [Sh88], [JSh] (see also [Bum97], [BCSGKK3],
[CoPSh94]).

Also in the case $G = GL_n$ the multiplicity one theorem (an analogue of
the theorem of Atkin–Lehner) (cf. [AL70], [Mi89], [Li75]) has been extended
(cf. [Gel75], [Gel76]). This is closely related to the non-vanishing theorem: for
a cuspidal representation $\pi$ one has $L(\pi, r_n, 1) \ne 0$.

For $GL_3$ an analogue of Weil's inverse theorem (see §6.3.8) has been proved:
if all the L-functions of type $L(\pi\otimes\chi, r_3, s)$ (where $\chi$ is a Hecke character and $\pi$
is an irreducible admissible representation) can be holomorphically continued
to the entire complex plane, then the representation $\pi$ can be realized in the
space of cusp forms ([CoPSh94], [JPShS]). For more recent results on the case of
$GL_n$, cf. [CoPSh02].

Interesting classes of L-functions attached to Siegel modular forms were
introduced and studied in [An74], [An79a], [AK78]. These modular forms and
their L-functions have deep arithmetical significance and are closely related
to the classical problem on the number of representations of a positive
definite integral quadratic form by a given integral quadratic form (as generating
functions, or theta-series). These numbers arise in Siegel's general formula
considered above (5.3.71). From the point of view of the theory of automorphic
representations, Siegel modular forms correspond to automorphic forms
on the symplectic group $G = GSp_n$. In this case the dual Langlands group
coincides with the universal covering $\mathrm{Spin}_{2n+1}(\mathbb{C})$ of the orthogonal group
$SO_{2n+1}(\mathbb{C})$. To construct L-functions one uses the following two kinds of
representation of the L-group ${}^L G = \mathrm{Spin}_{2n+1}(\mathbb{C}) \times \mathrm{Gal}(K^s/K)$: $\rho_{2n+1}$ and $r_n$, where
$\rho_{2n+1}$ is the standard representation of the orthogonal group, and $r_n$ is the
spinor representation of dimension $2^n$. It is convenient to consider the following
matrix realization of the orthogonal group:


$$SO_{2n+1}(\mathbb{C}) = \{g \in SL_{2n+1}(\mathbb{C}) \mid {}^t g\, G_n\, g = G_n\},$$


with a quadratic form defined by the matrix

$$G_n = \begin{pmatrix} 0_n & 1_n & 0 \\ 1_n & 0_n & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$


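If $\pi = \otimes_v \pi_v \in \mathcal{A}(GSp_n/K)$ then for almost all $v$ the representation $\pi_v$
corresponds to a conjugacy class $h_v$ in ${}^L G$ whose image in the standard
representation is given by a diagonal matrix of the type

$$\rho_{2n+1}(h_v) = \{\alpha_{1,v}, \dots, \alpha_{n,v}, \alpha_{1,v}^{-1}, \dots, \alpha_{n,v}^{-1}, 1\},$$

and in the spinor representation it becomes

$$r_n(h_v) = \{\beta_{0,v}, \beta_{0,v}\alpha_{1,v}, \dots, \beta_{0,v}\alpha_{i_1,v}\alpha_{i_2,v}\cdots\alpha_{i_m,v}, \dots\},$$

where for every $m \le n$ all possible products of the type

$$\beta_{0,v}\alpha_{i_1,v}\cdots\alpha_{i_m,v}, \qquad 1 \le i_1 < i_2 < \dots < i_m \le n,$$

arise.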

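The element $h_v$ is uniquely defined up to the action of the Weyl group $W_n$,
generated by the substitutions

$$\beta_{0,v} \mapsto \beta_{0,v}\alpha_{i,v}, \qquad \alpha_{i,v} \mapsto \alpha_{i,v}^{-1}, \qquad \alpha_{j,v} \mapsto \alpha_{j,v} \quad (j \ne i)$$

and by all possible permutations of the coordinates

$$\alpha_{1,v}, \alpha_{2,v}, \dots, \alpha_{n,v}.$$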
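
The combinatorics of the spinor parameters can be made concrete with a short computation. The sketch below is only an illustration: the function names and the sample values are invented here. It lists the $2^n$ products $\beta_0\alpha_{i_1}\cdots\alpha_{i_m}$ and checks on an example that the substitution $\beta_0 \mapsto \beta_0\alpha_i$, $\alpha_i \mapsto \alpha_i^{-1}$ only permutes them.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def spinor_parameters(beta0, alphas):
    """All products beta0 * alpha_{i1} * ... * alpha_{im} over subsets i1 < ... < im."""
    params = []
    for m in range(len(alphas) + 1):
        for subset in combinations(alphas, m):
            params.append(beta0 * prod(subset))  # the empty product is 1
    return params

def weyl_substitution(beta0, alphas, i):
    """Apply beta0 -> beta0 * alpha_i, alpha_i -> 1 / alpha_i, other alphas fixed."""
    new_alphas = list(alphas)
    new_alphas[i] = 1 / Fraction(alphas[i])
    return beta0 * alphas[i], new_alphas

beta0, alphas = Fraction(1), [Fraction(2), Fraction(3), Fraction(5)]
before = sorted(spinor_parameters(beta0, alphas))
after = sorted(spinor_parameters(*weyl_substitution(beta0, alphas, 0)))
print(len(before), before == after)
```

For $n = 3$ this produces $2^3 = 8$ parameters, matching the dimension $2^n$ of the spinor representation.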




A.N. Andrianov has established meromorphic continuations and functional
equations for automorphic L-functions of the type $L(\pi_f, r_n, s)$ where $\pi_f$ is
the automorphic representation of $GSp_n(\mathbb{A})$ over $\mathbb{Q}$ attached to a Siegel
modular form $f$ with respect to $\Gamma_n = Sp_n(\mathbb{Z})$, $n = 2$. He has also studied the
holomorphy properties of these spinor L-functions for various classes of Siegel
modular forms $f$, cf. [An74], [An79a]. Analytic properties of such functions
are related to versions of the theory of new forms in the Siegel modular case
for $n = 2$, cf. [AP2000]. A.N. Andrianov and V.L. Kalinin in [AK78] have
studied the analytic properties of the standard L-functions $L(\pi_f, \rho_{2n+1}, s)$, where
$\pi_f$ is the automorphic representation of $GSp_n(\mathbb{A})$ over $\mathbb{Q}$ attached to a Siegel
modular form $f$ with respect to the congruence subgroup $\Gamma_0^n(N) \subset Sp_n(\mathbb{Z})$.
For $n = 1$ these L-functions coincide with the symmetric squares of Hecke
series, previously studied by Shimura.

A general doubling method giving explicit constructions of many automor- 
phic L-functions, was developed in [Boe85] and [GPShR87]. 


Further analytic properties of automorphic L-functions


We refer to Sarnak's plenary lecture [Sar98] at ICM-1998, and to the related
papers [IwSa99], [KS99], [KS99a], [LRS99], [KiSha99].

In [IwSa99], four fundamental conjectures were discussed: (A) Grand
Riemann hypothesis; (B) Subconvexity problem; (C) Generalized Ramanujan
conjecture; (D) Birch and Swinnerton-Dyer conjecture. Another problem
which is related to (D) is a special value problem, namely the question as to
whether an L-function vanishes at a special point on the critical line.

From the classical point of view, analytic and arithmetic properties of new
classes of automorphic L-functions were studied in Shimura's recent books
[Shi2000], [Shi04], using a developed theory of Eisenstein series on reductive
groups.


6.5.3 The Langlands Functoriality Principle 


(cf. [Bor79], [BoCa79], [Gel75], [Pan84], and for recent developments, [Lau02],
[Hen01], [Car2000], [Li2000], [BCSGKK3], [CKPShSh]). This important principle
establishes ties between automorphic representations of different reductive
groups $H$ and $G$. A homomorphism of the L-groups $u : {}^L H \to {}^L G$
attached to $H$ and $G$ is called an L-homomorphism if the restriction of $u$ to
${}^L H^0(\mathbb{C})$ is a complex analytic homomorphism to ${}^L G^0(\mathbb{C})$, and $u$ induces the
identity map on the Galois group $G_K$. The functoriality principle is formulated
in terms of the conjugacy classes of the matrices $h_v$ corresponding to
the local components $\pi_v$ of an irreducible admissible representation $\pi = \otimes_v \pi_v$
of the group $H(\mathbb{A}_K)$. It includes the following statements:


1) locally: for almost all $v$ there exists an irreducible admissible representation
$u_*(\pi_v)$ of the group $G_v = G(K_v)$ which corresponds to the conjugacy class
of the element $u(h_v)$ in ${}^L G$;

2) globally: there exists an irreducible admissible representation
$u_*(\pi) = \pi' = \otimes_v \pi'_v \in \mathcal{A}(G/K)$ such that $\pi'_v = u_*(\pi_v)$ for almost all $v$. In this
situation the representation $\pi'$ is also called the lifting of the representation $\pi$.


In particular, according to this principle every automorphic L-function of
the type $L(\pi,r,s)$ where $r : {}^L G \to GL_m(\mathbb{C})$, must coincide with the L-function
$L(r_*(\pi), r_m, s)$ of the general linear group $GL_m$ with the standard representation
$r_m$ of the L-group ${}^L G^0 = GL_m(\mathbb{C})$. These automorphic L-functions are
called standard L-functions, and as was noted above their analytic properties
have to a certain extent already been studied.

Liftings of automorphic forms can be studied using the Selberg trace formula
([BoCa79], [Sel89, Sel89], [Arth83], [ArCl89]). This powerful tool establishes
a connection between characters of irreducible representations and
conjugacy classes, generalizing the classical result for finite groups.

The functoriality principle for automorphic forms is closely related to the
problem of parametrizing the set of equivalence classes of irreducible admissible
representations over global and local fields by means of representations of
the Galois group (or more precisely by means of homomorphisms from the Weil
group of the ground field (6.2.6) to the L-group ${}^L G$, regarded as a group over
$\mathbb{C}$ in the local case, or as a group over all completions $E_\lambda$ of some number field
$E$ in the global case). It is conjectured that to an admissible homomorphism
of that type must correspond a non-empty set, referred to as an L-packet, of
classes of irreducible admissible representations of the group $G(K_v)$ or $G(\mathbb{A}_K)$
(this is the Langlands conjecture). In this correspondence the L-function of
a representation of the Weil group (6.2.6) is identified with the L-function
of the associated automorphic (irreducible, admissible) representation of the
reductive group.

In the case $G = GL_1$ this conjecture is the essential content of class field
theory (both local and global) establishing a correspondence between characters
of the group $\mathrm{Gal}(\overline{K}/K)$ and automorphic forms on $GL_1$, which are
characters of the idele class group (in the global case) or characters of the
multiplicative group (in the local case).

The task of passing from $GL_1$ to other reductive groups is a vast
noncommutative generalization of class field theory. We have considered above
special cases of this correspondence attached to classical modular forms, the
group $GL_2$ and two-dimensional Galois representations (both complex and
$l$-adic). These examples seem to be a promising start to a theory, which is
intended to tie together algebraic varieties (motives), Galois representations and
automorphic forms (automorphic representations). An excellent introduction
to the Langlands program is contained in [BCSGKK3] and [CKM04].


6.5.4 Automorphic Forms and Langlands Conjectures 


For some recent developments in automorphic forms and applications we also
refer to [Laff02], [Lau02], [Hen01], [Car2000], [Ha98], [Li2000]. Significant
progress in the area of automorphic forms and their applications has been
made in the last decade.

A fundamental problem of number theory is to classify representations
of the Galois group $\mathrm{Gal}(F^s/F)$ where $F^s$ is a separable closure of a global
field $F$, and a fundamental problem of group theory is to give the spectral
decomposition of the space $L^2(G(F)\backslash G(\mathbb{A}_F))$ of automorphic forms over a
reductive group $G$ over $F$.

We only mention the work on the local Langlands conjecture (for
$G = GL_n(K)$ over a local field $K$), cf. [Hen01], [Car2000], [Ha98], and [LRS93],
where the general case in positive characteristic was treated.

Also, we only mention Lafforgue's work on the Langlands conjecture in
positive characteristic, cf. [Lau02], [Laff02], [L02], where the Langlands
correspondence was established for $G = GL_r$ with arbitrary $r$ over a global
function field $F = \mathbb{F}_q(X)$ of characteristic $p > 0$ where $X$ is a smooth projective
curve over $\mathbb{F}_q$. For a geometric version of the Langlands correspondence
we refer to [BCSGKK3], and to [Ngo2000], containing a proof of a
Frenkel–Gaitsgory–Kazhdan–Vilonen conjecture for general linear groups.


7 


Fermat’s Last Theorem and Families of 
Modular Forms 


7.1 The Shimura—Taniyama—Weil Conjecture and Higher 
Reciprocity Laws 


7.1.1 Problem of Pierre de Fermat (1601-1665) 


This chapter is based on the courses of lectures given by the second author in 
the Ecole Normale Supérieure de Lyon (February-May 2001), in the Moscow 
State University (May 2001), and in the Institut Fourier (October-December 
2001). Wiles' proof of Fermat's Last Theorem and of the Shimura–Taniyama–Weil
Conjecture provides a magnificent example of a synthesis of different
ideas and theories from previous Chapters, such as algebraic number theory, 
ring theory, algebraic geometry, the theory of elliptic curves, representation 
theory, Iwasawa theory, and deformation theory of Galois representations. 

Pierre de Fermat (1601-1665) raised his most famous problem (c.1637) in 
the margin of a translation of Diophantus’ “Arithmetic” (see also [He97] and 
[KKS2000]): 


Cubum autem in duos cubos, aut quadratoquadratum in duos 
quadratoquadratos, et generaliter nullam in infinitum ultra quadra- 
tum potestatem in duos ejusdem nominis fas est dividere: cujus rei 
demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non 
caperet 


(It is impossible to separate a cube into two cubes, or a fourth 
power into two fourth powers, or in general, any other power higher 
than the second, into two like powers. I have discovered a truly mar- 
velous proof of this, which this margin is too narrow to contain). 


In the modern language “Fermat’s Last Theorem” says that 


$$\text{for } n > 2: \quad \left\{ \begin{array}{l} x^n + y^n = z^n \\ x, y, z \in \mathbb{Z} \end{array} \right. \Longrightarrow xyz = 0 \qquad (\mathrm{FLT}(n))$$

According to Ram Murty in [Mur99], “FLT deserves a special place in the
history of civilization. Because of its simplicity, it has tantalized amateurs 
and professionals alike .... It is as if some supermind planned it all and over 
centuries had been developing diverse streams of thought only to have them 
fuse in a spectacular synthesis to resolve FLT. No single brain can claim 
expertise in all of the ideas that have gone into this “marvelous proof”. In 
the age of specialization, where each one of us knows “more and more things 
about less and less”, it is vital for us to have an overview of the masterpiece 
such as the one provided by this book.” 


The following is a summary of early progress on FLT. 


Case n = 4 (Fermat himself in a letter to Huygens); 

Case n = 3 (Euler in 1753); 

Case n = 5 (Dirichlet, Legendre, c.1825); 

Case n = 7 (G.Lamé, 1839; n = 14 was already done by Dirichlet in 1832); 

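The “first case”, $\mathrm{FLT}_1(p)$, for all primes $p$ for which $q = 2p + 1$ is also prime
(Sophie Germain in a letter to Gauss in 1820):

$$\left\{ \begin{array}{l} x^p + y^p = z^p \\ x, y, z \in \mathbb{Z} \end{array} \right. \Longrightarrow xyz \equiv 0 \bmod p \qquad (\mathrm{FLT}_1(p)).$$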


7.1.2 G.Lamé’s Mistake 


On March 1, 1847, the French mathematician G.Lamé informed the Academy
of Sciences in Paris that he had found a complete proof of FLT based on the
identity

$$x^p + y^p = (x + y)(x + \zeta y) \cdots (x + \zeta^{p-1} y), \quad \zeta = \zeta_p = \exp(2\pi i/p), \quad p \ne 2,$$

assuming the uniqueness of factorization in the ring $\mathbb{Z}[\zeta_p]$.

Immediately J.Liouville reacted by saying: “N’y a-t-il pas là une lacune à
remplir?” (“Isn’t there a gap to be filled?”), and indeed a few months later
A.Cauchy discovered non-unique factorizations in $\mathbb{Z}[\zeta_{23}]$.


E.Kummer’s Work 


E. Kummer in 1847 introduced the notion of a regular prime p, which in mod- 
ern language is: 


$$\text{for every ideal } I \subset \mathbb{Z}[\zeta_p]: \quad (I^p \text{ principal} \Longrightarrow I \text{ principal}).$$

He proved FLT($p$) for all regular primes $p$ (for which he was awarded the
Golden Medal of the Academy of Sciences in Paris in 1850). The smallest
irregular prime is $p = 37$, see [BS85], [He97].


7.1.3 A short overview of Wiles’ Marvelous Proof


(see also [Ste95], [RubSil94], [Da95]). In lectures at the Newton Institute in
June 1993, Andrew Wiles announced a proof of a large part of the
Shimura–Taniyama–Weil conjecture (STW) and, as a consequence, Fermat’s Last
Theorem (using an earlier result of K.Ribet [Ri]). A final corrected version of this
proof, completed together with R. Taylor, has appeared in [Wi] and [Ta-Wi].
In this truly marvelous proof, a traditional argument of reductio ad absurdum
is presented in the following form: if $a^p + b^p = c^p$, $abc \ne 0$, for a prime $p \ge 5$ (a
primitive solution $(a, b, c)$), then one shows the existence of a non-zero
holomorphic function $f = f_{a,b,c} : \mathbb{H} \to \mathbb{C}$ on the Poincaré upper half-plane $\mathbb{H}$,
defined by a certain Fourier series with the first coefficient equal to 1. It turns
out that this function has too many symmetries, forcing $f \equiv 0$, a contradiction
with its construction.


In the discussion below, the following group of symmetries plays an im- 
portant role: 


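$$\Gamma_0(N) = \left\{ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}) \;\middle|\; \gamma \equiv 0 \bmod N \right\} \subset \mathrm{SL}_2(\mathbb{Z}).$$

This group is called the congruence subgroup of level $N$, and the corresponding
compact Riemann surface

$$X_0(N) = \Gamma_0(N) \backslash \overline{\mathbb{H}}, \quad \text{where } \overline{\mathbb{H}} := \mathbb{H} \cup \mathbb{Q} \cup i\infty,$$

is called a modular curve of level $N$. A function $f : \mathbb{H} \to \mathbb{C}$ is called a modular
form of weight 2 and level $N$ if the differential $\omega_f = f(z)\,dz$ descends to a
holomorphic differential on $X_0(N)$, that is, $\omega_f$ is holomorphic and invariant
under the action of $\Gamma_0(N)$:

$$\forall \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \Gamma_0(N), \quad f\left(\frac{\alpha z + \beta}{\gamma z + \delta}\right) = (\gamma z + \delta)^2 f(z),$$

and $f(x) = 0$ for all $x \in \mathbb{Q} \cup i\infty$. We write $S_2(N) \cong \Omega^1(X_0(N))$ for the
$\mathbb{C}$-vector space of such forms; it has dimension $\dim S_2(N) = \mathrm{genus}(X_0(N))$.


The main point is to show that $f = f_{a,b,c}$ is a (non-trivial) modular form
of weight 2 and level 2: $f \in S_2(2)$. However, $X_0(2) \cong S$ (the Riemann sphere)
has no non-trivial holomorphic differentials, a contradiction.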


To prove the existence of a non-zero function f with these properties one 
starts with the Frey—Hellegouarch curve 


$$E = E_{a^p, b^p, c^p} : \; y^2 = P_3(x), \quad \text{where } P_3(x) = x(x - a^p)(x + b^p), \qquad (7.1.1)$$

(assuming without loss of generality that $a^p \equiv -1 \bmod 4$ and $b$ is even). One
observes first of all that the discriminant of the cubic polynomial $P_3$, whose
roots are $0$, $a^p$ and $-b^p$, is equal to $(a^p \cdot b^p \cdot (a^p + b^p))^2 = (abc)^{2p} \ne 0$.
Hence the projectivization $E$ of the affine curve with the
equation (7.1.1) is smooth of genus 1, and has a rational point at infinity. Let
us consider the generating series: for any prime $l$ define

$$N_l(E) = \#\{(x, y) \in \mathbb{F}_l^2 \mid y^2 = P_3(x) \bmod l\}, \qquad (7.1.2)$$

$$b_l = b_l(E) = l - N_l(E),$$

$$g = \sum_{n \ge 1} b_n q^n, \quad q = \exp(2\pi i z), \qquad (7.1.3)$$

$$\text{with} \quad \sum_{n \ge 1} b_n n^{-s} = \prod_{l \in S} \frac{1}{1 - b_l l^{-s}} \prod_{l \notin S} \frac{1}{1 - b_l l^{-s} + l^{1-2s}},$$

for a finite set $S$ of primes containing all prime divisors of the discriminant of
$P_3$. In particular, in the definition (7.1.3) one has $b_1 = 1$.

One sees that the series g converges and hence defines a holomorphic func- 
tion on the complex Poincaré upper half-plane H, and that 


$$N_l(E) = l + \sum_{x \bmod l} \left(\frac{P_3(x)}{l}\right), \quad \left(\frac{\cdot}{l}\right) \text{ being the Legendre symbol.}$$

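The formula for $N_l(E)$ holds because, for each fixed $x$ and each odd prime $l$, the equation $y^2 = P_3(x)$ has exactly $1 + \left(\frac{P_3(x)}{l}\right)$ solutions $y$ modulo $l$. A minimal numerical sketch of this count follows. The sample values A = 9 and B = 16 standing in for $a^p$ and $b^p$, and all function names, are illustrative assumptions and do not come from the text.

```python
def legendre(a, l):
    """Legendre symbol (a/l) for an odd prime l, via Euler's criterion."""
    r = pow(a % l, (l - 1) // 2, l)
    return -1 if r == l - 1 else r  # r is 0, 1 or l - 1


def p3(x, A=9, B=16):
    """Cubic x(x - A)(x + B); A and B are illustrative stand-ins for a^p, b^p."""
    return x * (x - A) * (x + B)


def count_affine_points(l):
    """Brute-force count of pairs (x, y) in F_l^2 with y^2 = P_3(x) mod l."""
    return sum(1 for x in range(l) for y in range(l)
               if (y * y - p3(x)) % l == 0)


def count_via_legendre(l):
    """The same count computed as l + sum over x of (P_3(x)/l)."""
    return l + sum(legendre(p3(x), l) for x in range(l))


if __name__ == "__main__":
    for l in (3, 5, 7, 11, 13):
        print(l, count_affine_points(l), count_via_legendre(l))
```

The two counts agree for every odd prime, including primes dividing the discriminant, since the argument is made one value of $x$ at a time.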
Now one proceeds to show that: 


– Modularity: if $E$ is a semistable elliptic curve then the generating series
$g = g_E$ is modular; this is the main ingredient of the proof;

– Controlling the level: there exists a modular form $f$ of an appropriate
minimal level $N_0 = N(E, p)$ such that the Fourier coefficients of $f$ are
congruent to those of $g$ modulo an appropriate prime ideal $\lambda_p$. It turns out that
for the Frey–Hellegouarch curve one has $N_0 = 2$.


Remark 7.1. According to the theorem of Faltings, the modularity of the series
$g = g_E$ is equivalent to the existence of a modular parametrization

$$\varphi_N : X_0(N) \to E,$$

since $E$ is isogenous to a factor of the Jacobian $J_0(N)$ coming from the choice
of a cusp eigenform given by the generating series (7.1.3) (see §6.4.3).


7.1.4 The STW Conjecture 


Conjecture 7.2 (Shimura–Taniyama–Weil). For any elliptic curve $E$ over $\mathbb{Q}$
there exist a finite set $S$ of primes and an $N = N(E, S) \in \mathbb{N}$ such that the
generating series $g = g_{E,S}$ given by (7.1.3) is a modular form of weight 2 and
level $N$.

If the set $S$ of exceptional primes is minimal, then $N$ is the minimal
conductor of $E$.

Example 7.3. Let

$$E : y^2 + y = x^3 - x^2, \quad S = \{11\}.$$

Then the generating series is given by

$$g = q \prod_{m \ge 1} (1 - q^m)^2 (1 - q^{11m})^2 = q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 + \cdots \in S_2(11)$$

l   | 2  | 3  | 5 | 7  | 11 | 13 | 17 | 19 | 23 | 29 | 31 | ... | 10007
N_l | 4  | 4  | 4 | 9  | -  | 9  | 19 | 19 | 24 | 29 | 24 | ... | 9989
b_l | -2 | -1 | 1 | -2 | -  | 4  | -2 | 0  | -1 | 0  | 7  | ... | 18
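The tabulated values can be checked numerically for small primes. The sketch below counts affine solutions of $y^2 + y = x^3 - x^2$ modulo $l$ by brute force and compares $l - N_l$ with the coefficients of the product $q \prod (1 - q^m)^2 (1 - q^{11m})^2$. It assumes that $N_l$ in the table is the affine point count, which is what the listed values suggest; the function names are illustrative.

```python
def affine_point_count(l):
    """N_l: number of (x, y) in F_l^2 with y^2 + y = x^3 - x^2 (mod l)."""
    return sum(1 for x in range(l) for y in range(l)
               if (y * y + y - x ** 3 + x * x) % l == 0)


def eta_product_coeffs(n_terms):
    """Coefficients c_0..c_n_terms of q * prod_{m>=1} (1-q^m)^2 (1-q^(11m))^2."""
    series = [0] * (n_terms + 1)
    series[1] = 1  # the leading factor q
    for m in range(1, n_terms + 1):
        for step in (m, m, 11 * m, 11 * m):
            # Multiply the truncated series by (1 - q^step), in place.
            # Going downwards keeps series[i - step] at its old value.
            for i in range(n_terms, step - 1, -1):
                series[i] -= series[i - step]
    return series


if __name__ == "__main__":
    coeffs = eta_product_coeffs(31)
    for l in (2, 3, 5, 7, 13, 17, 19, 23, 29, 31):
        n_l = affine_point_count(l)
        print(l, n_l, l - n_l, coeffs[l])
```

The brute-force count is quadratic in $l$, so it is only practical for small primes; the entry for $l = 10007$ would need a faster method.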


7.1.5 A connection with the Quadratic Reciprocity Law 


(cf. [Da99]). Conjecture 7.2 may be viewed as a far-reaching generalization
of the Quadratic Reciprocity Law of Gauss. Indeed, let us consider for any
$d \in \mathbb{Z}$,

$$N_l(d) = \#\{x \in \mathbb{F}_l \mid x^2 = d \bmod l\},$$

hence

$$N_l(d) = 1 + \left(\frac{d}{l}\right) = \begin{cases} 2, & d \in (\mathbb{F}_l^\times)^2, \\ 1, & d \equiv 0 \bmod l, \\ 0, & d \notin (\mathbb{F}_l)^2. \end{cases}$$

By quadratic reciprocity, $N_l(d)$ depends only on $l \bmod 4|d|$, and the generating
series

$$g_d = \sum_{n \ge 1} b_n q^n \quad \text{with} \quad \sum_{n \ge 1} b_n n^{-s} = \prod_l \left(1 - \left(\frac{d}{l}\right) l^{-s}\right)^{-1} \qquad (7.1.4)$$

belongs in fact to the finite dimensional complex vector space consisting of all
formal Fourier series with coefficients $b_n$ periodic modulo $4|d|$ (as functions of
$n$). The generating series (7.1.4) is in fact attached to the Galois representation

$$\rho = \rho_d : G_{\mathbb{Q}} \to \{\pm 1\} = \mathrm{GL}_1(\mathbb{Z}), \quad \left(\rho_d(n) = \left(\frac{d}{n}\right)\right),$$

and therefore has the same nature as the series (7.1.3), also coming from a
Galois representation (attached to $E$).
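The periodicity statement, that $N_l(d)$ depends only on $l \bmod 4|d|$, can be tested experimentally. The following sketch counts square roots of $d$ modulo each prime below a bound and checks that primes in the same residue class modulo $4|d|$ give the same count. The bound of 500 and the function names are arbitrary illustrative choices.

```python
def primes_up_to(bound):
    """Sieve of Eratosthenes: all primes <= bound."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, bound + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def square_root_count(d, l):
    """N_l(d): number of x in F_l with x^2 = d (mod l)."""
    return sum(1 for x in range(l) if (x * x - d) % l == 0)


def depends_only_on_residue(d, bound=500):
    """Check that N_l(d) is constant on classes of primes l modulo 4|d|."""
    modulus = 4 * abs(d)
    seen = {}
    for l in primes_up_to(bound):
        count = square_root_count(d, l)
        # setdefault records the first count seen for this residue class
        if seen.setdefault(l % modulus, count) != count:
            return False
    return True


if __name__ == "__main__":
    for d in (-7, -3, -1, 2, 3, 5, 6, 10):
        print(d, depends_only_on_residue(d))
```

A passing check for primes below the bound is evidence, not a proof; the proof is the reciprocity law itself.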

7.1.6 A complete proof of the STW conjecture 


The STW conjecture was proved in full generality in 1999 by Ch.Breuil,
B.Conrad, F.Diamond and R.Taylor (see [Da99]). In 1994 A.Wiles proved
this conjecture for the important subset of semistable elliptic curves. This was
sufficient to deduce FLT from STW, since if one assumes the existence of a
Frey–Hellegouarch curve, then such a curve would necessarily be semistable.
We shall explain the notion of semistability below. By a theorem of
K.Ribet (1986) (cf. [Ri]), conjectured by G.Frey in autumn 1984, a
Frey–Hellegouarch curve cannot be modular, since its generating series would have
too many symmetries. Therefore the semistable STW implies FLT by the
non-modularity of the Frey–Hellegouarch curves.


Note that the full STW is necessary, for example, in order to prove that 
a? +b? = 2? —> abc =0 for p> 5. 


One proves this result using a curve analogous to the Frey—Hellegouarch curve; 
however, this analogous curve is no longer semistable. 


Controlling the level: the existence of a modular form $f$ of minimal level
$N_0 = N(p, E)$ attached to an $E$ and $p$ is given as a consequence of the following
theorem of Mazur–Ribet (which is only briefly discussed in this chapter,
but see a detailed exposition in [Ri], [Edx95]).

The theorem of Mazur–Ribet is formulated in terms of the Artin conductor
$N(\rho_0)$ of a Galois representation

$$\rho_0 : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\overline{\mathbb{F}}_p),$$

and in terms of the formal power series

$$g = g_{\rho_0} = \sum_{n \ge 1} b_n q^n \in \overline{\mathbb{F}}_p[[q]],$$

which may be attached to any such representation. Assuming that the series $g$
is the $q$-expansion (modulo a prime) of a modular form of weight 2 and some
level $N$, we say that $\rho_0$ is modular of weight 2 and level $N$.

Assuming this (and some other conditions, including the irreducibility of
$\rho_0$, see Theorem 7.4) the theorem of Mazur–Ribet states the existence of a
modular form of the minimal level $N(\rho_0)$ congruent to the series $g \in \overline{\mathbb{F}}_p[[q]]$.
In particular, this result is applicable to the Galois representation

$$\rho_0 = \overline{\rho}_{p,E} : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{F}_p),$$

given by the Galois action on the points of order $p$ of $E$. The notation
$N_0 = N(p, E)$ is used for $N(\rho_0)$. Assuming the modularity of $E$,
one deduces the modularity of the Galois representation $\rho_0$, and this implies
the existence of a modular form of minimal level $N(\rho_0)$ with this property. It
turns out that for the Frey–Hellegouarch curve we have $N(p, E) = N(\rho_0) = 2$,
and this is sufficient to deduce FLT from STW.


We shall give here only some formulations and brief comments on results
about modular forms of minimal level:

Theorem 7.4 (Mazur–Ribet). Let $p \ge 3$ be a prime. Let $\rho_0 : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to
\mathrm{GL}_2(\overline{\mathbb{F}}_p)$ be an irreducible Galois representation, which is modular of weight
2 and of a square-free level $N$.

If $\rho_0$ is finite at $p$, then it is modular of weight 2 and of level $N(\rho_0)$, where
$N(\rho_0)$ is the Artin conductor of $\rho_0$ (see (6.4.22)). If $\rho_0$ is not finite at $p$, then
it is modular of weight 2 and of level $pN(\rho_0)$.


Remark 7.5. 1) The condition “$\rho_0$ is finite at $p$” is a local condition concerning
the restriction $\rho_0|_{D_p}$ to the decomposition group at $p$. This condition
means that $\rho_0|_{D_p}$ comes from a finite flat group scheme over $\mathbb{Z}_p$ (cf. Tate’s
paper in [CSS95]);

2) By a general property of the Galois representation

$$\rho_0 : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\overline{\mathbb{F}}_p),$$

attached to a cusp form of level $N$, it follows that $N(\rho_0)$ always divides
$N$ (see §6.4.2 and [Edx95]).


The proof of Theorem 7.4 has two parts. The first part, due to Mazur,
deals with primes $l$ dividing $N/N(\rho_0)$ that are not congruent to 1 modulo $p$.
The second part, due to Ribet, deals with an arbitrary $l \ne p$ at the cost of
introducing a prime $q$ in the level that can then be removed by the first
part.


Theorem 7.6 (Mazur). Let $p \ge 3$ be a prime. Let $\rho_0 : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to
\mathrm{GL}_2(\overline{\mathbb{F}}_p)$ be an irreducible Galois representation, which is modular of weight
2 and some level $N$. Suppose that $l$ is a prime not congruent to 1 mod $p$, that
$l$ divides $N$ but $l^2$ does not, that $\rho_0$ is unramified at $l$ if $l \ne p$ and that $\rho_0$ is
finite at $p$ if $l = p$.

Then $\rho_0$ is modular of weight 2 and level $N/l$.


Theorem 7.7 (Ribet). Let $p \ge 3$ be a prime. Let $\rho_0 : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$
be an irreducible Galois representation, which is modular of weight 2 and some
level $N$. Suppose that $l \ne p$ is a prime, that $l$ divides $N$ but $l^2$ does not, and that
$\rho_0$ is unramified at $l$.

Then there exists a prime number $q$ not dividing $N$ and congruent to
$-1$ mod $p$, such that $\rho_0$ is modular of weight 2 and level $qN/l$.


Corollary 7.8. Let $E$ be a semistable elliptic curve. Assume that the
generating series

$$g = g_{E,S} = \sum_{n \ge 1} b_n q^n \in \mathbb{Z}[[q]]$$

is modular (i.e. $g \in S_2(N)$ for some $N$). Then the conditions of Theorem 7.4
are satisfied for the Galois representation

$$\overline{\rho}_{p,E} : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{F}_p).$$

Hence there exists a modular form of weight 2 and minimal level $N_0$,

$$f = \sum_{n \ge 1} a_n q^n \in \mathcal{O}_{\overline{\mathbb{Q}}}[[q]]$$

with coefficients in the ring of algebraic integers $\mathcal{O}_{\overline{\mathbb{Q}}} \subset \overline{\mathbb{Q}} \subset \mathbb{C}$ such that

$$f \in S_2(N_0) \quad \text{and} \quad \forall l \notin S, \; a_l \equiv b_l \bmod \lambda_p,$$

for some maximal ideal $\lambda_p \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ containing $p$ (in particular, $f \ne 0$).


Corollary 7.9. If $p \ge 5$, then the Frey–Hellegouarch curve $E = E_{a^p,b^p,c^p} :
y^2 = x(x - a^p)(x + b^p)$ with $a^p \equiv -1 \bmod 4$ and $b$ even, cannot be modular.


In fact, using Tate’s curve (see §6.3.3) one can easily calculate the Artin
conductor in this case: it turns out that $N_0 = 2$. On the other hand $S_2(2)$ is
zero because $S_2(2) \cong \Omega^1(X_0(2)) = 0$. Hence by Theorem 7.4, $f$ is identically
zero, which is absurd since its first coefficient is 1.


7.1.7 Modularity of semistable elliptic curves 


The main purpose of this chapter is to explain Wiles’ proof of the modularity 
of all semistable elliptic curves over Q. 


Definition 7.10. a) An elliptic curve $E$ over $\mathbb{Q}$ is said to be semistable at a
prime $l$ if one can choose its equation in the form $\Phi(x, y) = 0$ in such a
way that

$$\Phi(x, y) \in \mathbb{Z}_l[x, y] \quad \text{and} \qquad (7.1.5)$$

$$\text{the singular points of the reduction } \overline{\Phi}(x, y) \in \mathbb{F}_l[x, y] \text{ are simple} \qquad (7.1.6)$$

(that is, the quadratic part of $\overline{\Phi}(x, y)$ at any singular point $(x_0, y_0)$ is
non-degenerate; recall that a singular point $(x_0, y_0)$ is a point such that

$$\overline{\Phi}(x_0, y_0) = \overline{\Phi}'_x(x_0, y_0) = \overline{\Phi}'_y(x_0, y_0) = 0).$$


b) An elliptic curve E over Q is called semistable if it is semistable at all 
primes l. 


Remark 7.11. Definition 7.10 is entirely geometric. However, one can give
a purely algebraic definition (see Remark 7.18) of the notion of semistability using Tate’s
uniformization, see §6.3.3: an elliptic curve $E$ is semistable if and only if the
representations $\rho_{p,E}$ on the Tate modules of $E$ satisfy the condition:

$$\forall p, \; \forall l \ne p, \quad \rho_{p,E}(I_l) \text{ is conjugate to a subgroup of } \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix},$$

where $I_l \subset G_{\mathbb{Q}} = \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ denotes the inertia subgroup.

Example 7.12. a) Consider the Frey–Hellegouarch curve $E$ given by $y^2 =
x(x - a^p)(x + b^p)$, where $p \ge 5$ is prime. We shall assume that $b$ is even
and $a^p \equiv -1 \bmod 4$. After the substitution $x = 4X$, $y = 8Y + 4X$ we
obtain the following equation

$$Y^2 + XY = X^3 + \frac{b^p - a^p - 1}{4} X^2 - \frac{a^p b^p}{16} X.$$

The point $(X, Y) = (0, 0)$ is in fact the only singular point mod 2, and
here the quadratic part is given by

$$\begin{cases} Y^2 + XY, & \text{if 8 divides } a^p + 1, \\ X^2 + XY + Y^2, & \text{if 8 does not divide } a^p + 1. \end{cases}$$

Thus in either case $E$ is semistable at $l = 2$.
Now suppose $l \ne 2$, $a^p + b^p = c^p$ and $l \mid abc$. The equation reduces to one of
the form $y^2 = x^2(x - \alpha) \bmod l$ with $\alpha \ne 0$. The only singular point here
is $(x, y) = (0, 0)$, and the quadratic part $y^2 + \alpha x^2$ is non-degenerate.

b) The modular elliptic curve $X_0(15) : y^2 = x(x + 3^2)(x - 4^2)$ is semistable,
whereas the curve $E : y^2 = x(x - 3^2)(x + 4^2)$ is not semistable.
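
The change of variables in part a) of Example 7.12 can be verified by direct expansion. The following is a short worked sketch of that computation; it only uses the assumptions already made there ($b$ even, $a^p \equiv -1 \bmod 4$, $p \ge 5$).

```latex
% Substitute x = 4X, y = 8Y + 4X into y^2 = x(x - a^p)(x + b^p):
\begin{align*}
(8Y + 4X)^2 &= 64Y^2 + 64XY + 16X^2, \\
4X(4X - a^p)(4X + b^p) &= 64X^3 + 16(b^p - a^p)X^2 - 4a^p b^p X .
\end{align*}
% Move 16X^2 to the right-hand side and divide by 64:
\[
Y^2 + XY = X^3 + \frac{b^p - a^p - 1}{4}\,X^2 - \frac{a^p b^p}{16}\,X .
\]
% Integrality: a^p \equiv -1 \bmod 4 and 32 \mid b^p (b even, p \ge 5),
% so 4 \mid b^p - a^p - 1 and a^p b^p / 16 is an even integer.
% Modulo 2 the equation becomes Y^2 + XY = X^3 + cX^2 with
% c \equiv (a^p + 1)/4 \bmod 2, which gives the two quadratic parts listed.
```

Since the coefficient of $X$ vanishes modulo 2, the origin is a singular point of the reduced curve, and the parity of $(a^p + 1)/4$ decides which of the two quadratic forms appears.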


Now the main result of the chapter says: 


Theorem 7.13 (semistable STW Conjecture, A.Wiles (1994)). Every 
semistable elliptic curve is modular. 


Corollary 7.14. There exist no Frey–Hellegouarch curves $E = E_{a^p,b^p,c^p}$, and
hence FLT($p$) is true for any prime $p \ge 5$.

7.1.8 Structure of the proof of theorem 7.13 (Semistable STW 
Conjecture) 


I. Modularity modulo p (with p = 3,5) 


Let $E : y^2 = P_3(x)$ be a semistable elliptic curve. We may assume that
$P_3(x) \in \mathbb{Z}[x]$, and we let $g = g_{E,S} = \sum_{n \ge 1} b_n q^n \in \mathbb{Z}[[q]]$ be its generating
series,

$$b_l = l - N_l(E) = -\sum_{x \bmod l} \left(\frac{P_3(x)}{l}\right), \quad \left(\frac{\cdot}{l}\right) \text{ is the Legendre symbol.}$$

One constructs a modular form $h = \sum_{n \ge 1} c_n q^n \in \mathcal{O}_{\overline{\mathbb{Q}}}[[q]]$ with $c_l \equiv b_l \bmod \lambda_p$ for
all $l \notin S$, the finite set of exceptional primes $S$, where $\lambda_p \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ is a prime ideal
containing $p$. This problem was solved only for $p = 3$ (the
Tunnell–Langlands–Serre Theorem) under the assumption of absolute irreducibility
modulo 3 of the representation

$$\overline{\rho}_{3,E} : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{F}_3)$$

(on the points of order 3 of $E$). This condition is not always satisfied, and this
was considered by A.Wiles as a significant difficulty in his proof. In May 1993
A.Wiles found a way to overcome this difficulty by switching
the problem from $p = 3$ to $p = 5$. He used a family $E_t$ of elliptic curves with
the property that all the representations $\overline{\rho}_{5,E_t}$ are isomorphic, and such that
there exists a curve $E' = E_{t_0}$ in this family with an irreducible representation
$\overline{\rho}_{3,E'}$. This made it possible to replace $E$ by $E'$ in the above argument, see
§7.7.


II. Modular lifting 


Any series $\tilde{h} = \sum_{n \ge 1} \tilde{c}_n q^n \in \mathcal{O}[[q]]$ with coefficients in a finite extension
$\mathcal{O}$ of $\mathbb{Z}_p$ (with maximal ideal $\lambda$), satisfying certain necessary conditions of
modularity, and having the property that

$$\forall l \notin S, \quad \tilde{c}_l \equiv c_l \bmod \lambda,$$

is automatically a modular form (which is called a lifting of $h \bmod \lambda$, or an
admissible deformation of $h$). One gives these necessary conditions in terms of
the absolute irreducibility of the Galois representations attached to modular
forms (these conditions are also used in Theorem 7.4). In other words,
one shows that under these conditions any admissible lifting of $h \bmod \lambda$ is in
fact modular.


III. Absolute irreducibility 


One shows that the absolute irreducibility conditions are satisfied for the
Galois representations

$$\overline{\rho}_{p,E} : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{F}_p)$$

either for $p = 3$ or for $p = 5$.


IV. End of the proof: passage from p = 3 to p= 5 


Let us consider the series $\tilde{h} = g$ (the generating series of our representation).
By I) this series is an admissible deformation of a modular form $h \bmod 3$
(under the conditions of absolute irreducibility for $p = 3$), and II) implies
that in this case $g$ must be a modular form. In other cases, III) says that the
irreducibility condition is satisfied for $p = 5$. Moreover, II) implies that $g' =
g_{E',S}$ is a modular form. By the construction of $E'$, we know that $g' \bmod 5 =
g \bmod 5 \in \mathbb{F}_5[[q]]$ again satisfies the conditions II) for modular lifting (with
$p = 5$). Hence $g$ is also modular in this case.


[Figure 7.1: diagram of implications linking the boxes “Langlands–Tunnell
Theorem”, “Semistable Modular Lifting for $p = 3$”, “Semistable Modular Lifting
for $p = 5$”, “Semistable Taniyama–Shimura for $\overline{\rho}_{E,3}$ irreducible”, “Semistable
Taniyama–Shimura Conjecture” and “Fermat’s Last Theorem”; bottom line:
Semistable Modular Lifting Conjecture $\Longrightarrow$ Fermat’s Last Theorem.]

Fig. 7.1.
The main stages of Wiles’ proof are presented in Figure 7.1, which is reproduced
here from [RubSil94], p. 6, with the kind permission of K.Rubin, A.Silverberg and
AMS.


In the next section we describe the Langlands-Tunnell Theorem. 


7.2 Theorem of Langlands-Tunnell and 
Modularity Modulo 3 


7.2.1 Galois representations: preparation 


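Recall that the Galois group $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ of $\overline{\mathbb{Q}} \subset \mathbb{C}$ contains, for any prime $l$ and
for any maximal ideal $\lambda \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ over $l$, the decomposition subgroup $D_\lambda$ and its
normal inertia subgroup $I_l = I_{\lambda/l}$:

$$\begin{array}{l}
G_{\mathbb{Q}} = \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \qquad (7.2.1) \\
\cup \\
D_l = D_\lambda = \{ g \in \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \mid g(\lambda) = \lambda \} \cong \mathrm{Gal}(\overline{\mathbb{Q}}_l/\mathbb{Q}_l) \\
\cup \\
I_l = I_{\lambda/l} = \{ g \in D_l \mid \forall x \in \mathcal{O}_{\overline{\mathbb{Q}}}, \; g(x) \equiv x \pmod{\lambda} \} \cong \mathrm{Gal}(\overline{\mathbb{Q}}_l/\mathbb{Q}_l^{nr}), \\
D_l/I_l \cong \mathrm{Gal}(\overline{\mathbb{F}}_l/\mathbb{F}_l) = \langle \mathrm{Frob}_l \rangle, \quad \mathrm{Frob}_l(x) = x^l, \quad \mathrm{Frob}_l \in \mathrm{Gal}(\overline{\mathbb{F}}_l/\mathbb{F}_l)
\end{array}$$

where $\mathbb{Q}_l^{nr}$ denotes the maximal unramified extension of the $l$-adic field $\mathbb{Q}_l$.
The subgroups $D_l = D_\lambda$ are all conjugate due to the transitivity of the
action of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the maximal ideals $\lambda$ above $l$. This means that for any
$l$, the Frobenius element $\mathrm{Frob}_l$ lifts to a conjugacy class of cosets of $I_l$ in $G_{\mathbb{Q}}$.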


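Definition 7.15. a) Let $A$ be a topological ring over $\mathbb{Z}_p$. We define a Galois
representation over $A$ to be a continuous homomorphism

$$\rho : G_{\mathbb{Q}} \to \mathrm{GL}_n(A).$$

b) $\rho$ is said to be unramified at $l$ if $\rho(I_l) = \{1\}$; in this case the trace
$\mathrm{tr}(\rho(\mathrm{Frob}_l)) \in A$ is well defined.
c) $\rho$ is said to be reducible if there exists a matrix $C \in \mathrm{GL}_n(A)$ such that

$$\forall g \in G_{\mathbb{Q}}, \quad C^{-1} \rho(g) C \in \begin{pmatrix} * & * \\ 0 & * \end{pmatrix}.$$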


Arithmetical examples. 


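a) The cyclotomic character. Let $A = \mathbb{Z}_p$, $n = 1$, $\rho = \chi_p : G_{\mathbb{Q}} \to \mathbb{Z}_p^\times$ the
character of the group $G_{\mathbb{Q}}$ defined by the action on the roots of unity of
degree $p^r$: $\varprojlim_r \mu_{p^r} \cong \mathbb{Z}_p$. One has $\chi_p(\mathrm{Frob}_l) = l \in \mathbb{Z}_p^\times$ for any $l \ne p$.

b) For an elliptic curve $E$ over $\mathbb{Q}$ one has $E(\mathbb{C}) \cong \mathbb{C}/\langle \omega_1, \omega_2 \rangle$ (Weierstrass’
theorem). Hence for any positive integer $m$ the group of $m$-torsion points

$$E[m] := \mathrm{Ker}(u \mapsto mu) \cong (\mathbb{Z}/m\mathbb{Z})^2$$

of $E$ is a $G_{\mathbb{Q}}$-module: $\rho_{m,E} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{Z}/m\mathbb{Z})$. Putting $m = p^r$ and
passing to the limit

$$\varprojlim_r E[p^r] \cong \mathbb{Z}_p^2,$$

one obtains a Galois representation

$$\rho_{p,E} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{Z}_p)$$

with the properties:

$$\det \rho_{p,E} = \chi_p, \qquad (7.2.2)$$

$$\rho_{p,E} \text{ is unramified at all primes } l \nmid p\Delta_E, \text{ and for these primes } \mathrm{tr}\,\rho_{p,E}(\mathrm{Frob}_l) = l + 1 - \#\tilde{E}(\mathbb{F}_l), \qquad (7.2.3)$$

where $\tilde{E}$ denotes the reduction $E \bmod l$ (which is good in this case due
to the Néron–Ogg–Shafarevich criterion, see §5.4.1).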


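Theorem 7.16 (Langlands–Tunnell–Serre). Let $E$ be an elliptic curve
and let

$$\overline{\rho}_{p,E} : \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\mathbb{F}_p)$$

be the representation on the points of order $p$ of $E$. Assume that $\overline{\rho}_{3,E}$ is
irreducible. Then there exists a cusp form $h = \sum_{n \ge 1} c_n q^n$, $c_n \in \mathcal{O}_{\overline{\mathbb{Q}}}$, of weight 2
and a maximal ideal $\lambda_3 \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ above 3, such that for all $l$ outside a finite set
$S = S(E)$,

$$c_l \equiv l + 1 - \#\tilde{E}(\mathbb{F}_l) \pmod{\lambda_3}.$$

The proof is explained in §7.2. It makes use of the commutative diagram

$$\begin{array}{ccc}
\mathrm{GL}_2(\mathbb{F}_3) & \xrightarrow{\;W\;} & \mathrm{GL}_2(\mathbb{Z}[\sqrt{-2}]) \subset \mathrm{GL}_2(\mathbb{C}) \\
 & {\scriptstyle \mathrm{id}} \searrow & \downarrow {\scriptstyle \bmod (1+\sqrt{-2})} \\
 & & \mathrm{GL}_2(\mathbb{F}_3)
\end{array}$$

where $W$ denotes the two-dimensional complex irreducible representation of
$\mathrm{GL}_2(\mathbb{F}_3)$ given by

$$W\begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}, \quad W\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -\sqrt{-2} & -1 + \sqrt{-2} \end{pmatrix}.$$

Remark 7.17. Note that the homomorphism $W$ is a section of the natural
homomorphism

$$\mathrm{GL}_2(\mathbb{Z}[\sqrt{-2}]) \to \mathrm{GL}_2(\mathbb{F}_3),$$

induced by the reduction modulo $(1 + \sqrt{-2})$.


7.2.2 Modularity modulo p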


The modularity of an arbitrary irreducible Galois representation modulo 3,
given by the proof of Theorem 7.16, is a special case of a general conjecture
of Serre [Se87] (see §6.4.6):

Let $p$ be a prime number, $\mathfrak{p}$ a prime ideal of the ring of all algebraic integers
$\mathcal{O} \subset \overline{\mathbb{Q}}$ dividing $p$ (i.e. $p \in \mathfrak{p}$). We call a representation

$$\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p) \qquad (7.2.4)$$

a modular representation of type $(N, k, \chi)$, if there is a modular form (see
§6.3.5)

$$h(z) = \sum_{n \ge 1} c_n e(nz) \in S_k(N, \chi) \quad (e(z) = \exp(2\pi i z)),$$

which is an eigenform of all the Hecke operators, normalized by $c_1 = 1$, such
that for all primes $l \nmid Np$ the representation $\rho$ is unramified at $l$ and we have

$$\mathrm{Tr}\,\rho(\mathrm{Frob}_l) \equiv c_l \bmod \mathfrak{p}. \qquad (7.2.5)$$

Serre conjectured that every irreducible representation (7.2.4) is modular
for some $N$ not divisible by $p$. He also described explicitly the numbers $N$
and $k$ and the character $\chi$, assuming that $N$ and $k$ are minimal subject to
the condition $(N, p) = 1$. According to Serre’s conjecture, the number $N$ is
determined by the ramification of $\rho$ outside $p$ in the same way as the Artin
conductor:

$$N = N(\rho) = \prod_{l \ne p} l^{n(l, \rho)}.$$

The weight $k$ is given by ramification properties of $\rho$ at $p$, and the character
$\chi : (\mathbb{Z}/N)^\times \to \overline{\mathbb{Q}}^\times$ can be obtained from the determinant of $\rho$ as follows:

$$\det \rho(\mathrm{Frob}_l) \equiv \chi(l)\, l^{k-1} \bmod \mathfrak{p} \quad (l \nmid Np).$$

It was noticed by Serre [Se87] that one can easily deduce this conjecture for
all representations into $\mathrm{GL}_2(\mathbb{F}_3)$ ($p = 3$) from a general result of
Langlands–Tunnell, cf. [L80], [Tun81], [Gel95], and §6.5. The Langlands–Tunnell
Theorem states that every two-dimensional odd complex Galois representation
$\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{C})$ with solvable image is modular. More precisely, let
$g_\rho = \sum_{n=1}^{\infty} b_n q^n$ be the generating series of $\rho$, i.e. the series whose
coefficients are those of the Artin $L$-series of $\rho$ (see §6.2.2, 6.4.5)

$$\sum_{n=1}^{\infty} b_n n^{-s} = L(\rho, s) = \prod_l \frac{1}{\det(1 - l^{-s} \rho(\mathrm{Frob}_l) \mid V^{I_l})},$$

where $\mathrm{GL}_2(\mathbb{C}) = \mathrm{GL}(V)$, $V = \mathbb{C}^2$ and $I_l$ is the inertia group at $l$. Then the
Langlands–Tunnell Theorem states that $g_\rho$ is a weight one modular form, and
is a cusp form if $\rho$ is irreducible (see §6.4.5).

To explain how this result implies a proof of Theorem 7.16, we shall
consider the complex representation $\rho = W \circ \overline{\rho}_{3,E}$. The image of $\rho$ is certainly
solvable, as it is isomorphic to a subgroup of the solvable group $\mathrm{GL}_2(\mathbb{F}_3)$.


7.2.3 Passage from cusp forms of weight one to cusp forms of 
weight two 


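Next, we shall construct a cusp form $h = \sum_{n \ge 1} c_n q^n$ of weight two starting
from a cusp form $g$ of weight one given by the Langlands–Tunnell Theorem.
For this purpose one uses Eisenstein series of weight $k$ and Dirichlet character
$\chi$ (generalizing the series of §6.3.1):

$$G_{k,\chi}(z) = -\frac{B_{k,\chi}}{2k} + \sum_{n=1}^{\infty} \sum_{d \mid n} \chi(d) d^{k-1} e(nz) \in M_k(N, \chi), \qquad (7.2.6)$$

$$E_{k,\chi}(z) = 1 - \frac{2k}{B_{k,\chi}} \sum_{n=1}^{\infty} \sum_{d \mid n} \chi(d) d^{k-1} e(nz) \in M_k(N, \chi), \qquad (7.2.7)$$

where $k \ge 1$, and $B_{k,\chi}$ is the $k$th generalized Bernoulli number (or
Bernoulli–Leopoldt number), defined by the equality

$$\sum_{k \ge 0} \frac{B_{k,\chi} t^k}{k!} = \sum_{a=1}^{N} \frac{\chi(a)\, t e^{at}}{e^{Nt} - 1}.$$

For $k = 1, 2$, one requires for convergence that $\chi$ is non-trivial. The important
property of these numbers is that

$$L(1 - k, \chi) = -\frac{B_{k,\chi}}{k}.$$

In particular, if $N = 3$, $k = 1$, then $\chi(d) = \chi_3(d) = \left(\frac{d}{3}\right)$ is an odd Dirichlet
character and we have $B_{1,\chi} = -\frac{1}{3}$. Thus

$$E_{1,\chi_3}(z) = 1 + 6 \sum_{n=1}^{\infty} \sum_{d \mid n} \left(\frac{d}{3}\right) e(nz) \in M_1(3, \chi_3).$$

In order to finish the proof of Theorem 7.16 we construct a cusp form $h =
\sum_{n \ge 1} c_n q^n$, $c_n \in \mathcal{O}_{\overline{\mathbb{Q}}}$, as the following product:

$$h = g_\rho E_{1,\chi_3} = \left(\sum_{n \ge 1} b_n q^n\right) \left(1 + 6 \sum_{n=1}^{\infty} \sum_{d \mid n} \left(\frac{d}{3}\right) q^n\right) =: \sum_{n \ge 1} c_n q^n, \quad c_n \in \mathcal{O}_{\overline{\mathbb{Q}}}.$$

Then $h$ is a cusp form of weight 2 with the desired properties: we take $S$ to
be the set of all primes dividing $3N$, where $N$ is the level of $g$; then for any
$l \notin S$,

$$c_l \equiv b_l \equiv l + 1 - \#\tilde{E}(\mathbb{F}_l) \pmod{\mathfrak{p}}$$

for any maximal ideal $\mathfrak{p} \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ containing $1 + \sqrt{-2}$.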
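
The step from weight one to weight two relies on the fact that every non-constant coefficient of $E_{1,\chi_3}$ is divisible by 3, so multiplying by it does not change a series modulo a prime above 3. The sketch below illustrates this for integer coefficients. It assumes the standard formula $B_{1,\chi} = \frac{1}{N}\sum_{a=1}^{N} \chi(a)\,a$ for a non-trivial character, and the sample series standing in for $g_\rho$ is an arbitrary illustrative choice (the actual coefficients lie in $\mathbb{Z}[\sqrt{-2}]$).

```python
from fractions import Fraction


def chi3(d):
    """Legendre symbol (d/3): 0, 1, -1 for d = 0, 1, 2 mod 3."""
    return (0, 1, -1)[d % 3]


def bernoulli_1_chi3():
    """B_{1,chi3} via B_{1,chi} = (1/N) * sum_{a=1}^{N} chi(a) * a with N = 3."""
    return Fraction(sum(chi3(a) * a for a in range(1, 4)), 3)


def eisenstein_coeffs(n_terms):
    """Coefficients e_0..e_n_terms of E_{1,chi3} = 1 - (2/B) * sum_n (sum_{d|n} chi3(d)) q^n."""
    factor = -2 / bernoulli_1_chi3()  # equals 6
    coeffs = [Fraction(1)]
    for n in range(1, n_terms + 1):
        divisor_sum = sum(chi3(d) for d in range(1, n + 1) if n % d == 0)
        coeffs.append(factor * divisor_sum)
    return [int(c) for c in coeffs]


def multiply(series_a, series_b):
    """Product of two truncated power series given as equal-length coefficient lists."""
    size = len(series_a)
    return [sum(series_a[i] * series_b[n - i] for i in range(n + 1))
            for n in range(size)]


if __name__ == "__main__":
    e = eisenstein_coeffs(7)
    sample_g = [0, 1, -2, 5, 0, 3, -1, 4]  # illustrative stand-in for g_rho
    h = multiply(sample_g, e)
    print(bernoulli_1_chi3(), e)
    print([(h[n] - sample_g[n]) % 3 for n in range(8)])
```

In the text the congruence is taken modulo a maximal ideal containing $1 + \sqrt{-2}$, which lies above 3; the integer check modulo 3 is the simplest instance of the same mechanism.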
7.2.4 Preliminary review of the stages of the proof of Theorem
7.13 on modularity


I) Any elliptic curve $E$ over $\mathbb{Q}$ is “modular modulo 3”, if its Galois
representation $\overline{\rho}_{3,E}$ is irreducible.


II) The result of I) gives only a necessary condition for modularity for p = 3 
(together with properties (7.2.2) and (7.2.3)). One can use these condi- 
tions as a starting point for proving modularity in the semistable case. 
We shall reformulate the modularity statement as an assertion about an 
isomorphism of certain “deformation rings”. 


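More precisely, let $\mathcal{O} \supset \mathbb{Z}_p$ be the ring of integers of a finite extension
$K \supset \mathbb{Q}_p$. Then $\mathcal{O}$ is a discrete valuation ring (DVR), and we shall write $\lambda$ for
its maximal ideal.

Consider a cusp form $h = \sum_{n \ge 1} c_n q^n \in \mathcal{O}_{\overline{\mathbb{Q}}}[[q]]$ (for example, the one just
constructed), and choose a maximal ideal $\lambda_p \subset \mathcal{O}_{\overline{\mathbb{Q}}}$ such that

$$\mathcal{O}_{\overline{\mathbb{Q}}}/\lambda_p \cong \overline{\mathbb{F}}_p \supset \mathcal{O}/\lambda \supset \mathbb{F}_p.$$

Then the meaning of the stage II) is to show that if we take any formal power
series

$$\tilde{h} = \sum_{n \ge 1} \tilde{c}_n q^n, \quad \tilde{c}_n \in \mathcal{O},$$

over the local ring $\mathcal{O}$, satisfying certain necessary conditions for modularity
and irreducibility, as well as the congruences

$$\tilde{c}_l \equiv c_l \bmod \lambda, \quad \text{for all } l \text{ outside a finite set } S,$$

then the series $\tilde{h}$ must be modular, i.e. it represents the Fourier expansion of a
modular form. Recall that we have fixed an embedding $i_p : \overline{\mathbb{Q}} \to \overline{\mathbb{Q}}_p \supset \mathcal{O} \supset \lambda$
with the property

$$i_p^{-1}(\lambda) \subset \lambda_p \subset \mathcal{O}_{\overline{\mathbb{Q}}} \subset \overline{\mathbb{Q}},$$

so we may regard $\tilde{h}$ as the image under $i_p$ of a series in $\mathcal{O}_{\overline{\mathbb{Q}}}[[q]]$.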


Remark 7.18 (Algebraic meaning of the semistability of $E$). Using Tate’s
uniformization, see §6.3.3, one can show that an elliptic curve $E$ is semistable
if and only if the representations $\rho_{p,E}$ on the Tate modules of $E$ satisfy the
condition:

$$\forall p, \; \forall l \ne p, \quad \rho_{p,E}(I_l) \text{ is conjugate to a subgroup of } \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix},$$

where $I_l \subset G_{\mathbb{Q}} = \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ denotes the inertia subgroup.


7.3 Modularity of Galois representations and Universal 
Deformation Rings 


7.3.1 Galois Representations over local Noetherian algebras 


The generating series $\tilde{h}$ to be constructed at stage II) may be interpreted
as the generating series of a Galois representation. In this section we treat
the problem of modularity of certain Galois representations

$$\rho : G_{\mathbb{Q}} \to \mathrm{GL}_m(A),$$

with coefficients in a local $\mathcal{O}$-algebra $A$ with maximal ideal $\mathfrak{m}_A$. As before in
Section 7.2, $\mathcal{O} \supset \mathbb{Z}_p$ denotes the ring of integers of a finite extension $K \supset \mathbb{Q}_p$.
Thus $\mathcal{O}$ is a discrete valuation ring (DVR), and $\lambda$ denotes the maximal ideal
of $\mathcal{O}$. We always assume that:

$$A/\mathfrak{m}_A \cong \mathcal{O}/\lambda = k \supset \mathbb{F}_p.$$


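Definition 7.19. Let $\mathcal{C} = \mathcal{C}_{\mathcal{O}}$ be the category of local noetherian $\mathcal{O}$-algebras
equipped with an augmentation $\pi_A : A \to \mathcal{O}$: its objects are given by

$$\mathcal{C} = \mathcal{C}_{\mathcal{O}} = \{(A, \pi_A) \mid \pi_A : A \to \mathcal{O} \text{ surjective}\},$$

and its morphisms are given by commuting triangles, i.e. by local $\mathcal{O}$-algebra
homomorphisms $\varphi : A \to B$ such that $\pi_B \circ \varphi = \pi_A$.


Example 7.20. a) $A = \mathcal{O} = \mathbb{Z}_p$;

b) $A = \mathcal{O}[[X_1, \dots, X_n]]$, $\mathfrak{m}_A = (\lambda, X_1, \dots, X_n)$, $\pi_A(f) = f(a_1, \dots, a_n)$ for
some fixed $a_i$;

c) $A = \mathbb{Z}_p[[X]]/(X(X - p^n)) \cong \{(a, b) \in \mathbb{Z}_p^2 \mid a \equiv b \bmod p^n\}$,
$\mathfrak{m}_A = \{(a, b) \in p\mathbb{Z}_p^2 \mid a \equiv b \bmod p^n\}$, $\pi_A(a, b) = a$.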
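
The identification in part c) of Example 7.20 can be made explicit by evaluating at the two roots of $X(X - p^n)$. The following is a short sketch; the name $\theta$ for the map is introduced here only for illustration.

```latex
\[
\theta : \mathbb{Z}_p[[X]]/(X(X - p^n)) \longrightarrow \mathbb{Z}_p^2, \qquad
f \longmapsto (f(0),\, f(p^n)).
\]
% Every class has a unique representative f = c_0 + c_1 X with c_0, c_1 in Z_p, and
\[
\theta(c_0 + c_1 X) = (c_0,\; c_0 + c_1 p^n).
\]
% Hence the image consists exactly of the pairs (a, b) with a \equiv b \bmod p^n,
% and \theta is injective since c_0 = a and c_1 = (b - a)/p^n are recovered from (a, b).
% Under \theta the augmentation \pi_A(a, b) = a corresponds to f \mapsto f(0).
```

In this description the maximal ideal consists of the pairs with $a \in p\mathbb{Z}_p$, which forces $b \in p\mathbb{Z}_p$ as well when $n \ge 1$.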


7.3.2 Deformations of Galois Representations 


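Definition 7.21. a) We fix a representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_m(k)$ over the
finite field $k$ as above. Then a lift $\rho$ of $\rho_0$ to $A$ is a representation $\rho : G_{\mathbb{Q}} \to
\mathrm{GL}_m(A)$ such that $\rho \bmod \mathfrak{m}_A = \rho_0$.

b) Two lifts $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_m(A)$ and $\rho' : G_{\mathbb{Q}} \to \mathrm{GL}_m(A)$ are called strictly
equivalent if there exists a matrix $C \in \mathrm{GL}_m(A)$ such that $C \equiv I_m \bmod \mathfrak{m}_A$,
and

$$\text{for all } g \in G_{\mathbb{Q}}, \quad \rho'(g) = C^{-1} \rho(g) C.$$

c) A deformation of $\rho_0$ in $A$ is a strict equivalence class of lifts of $\rho_0$ to $A$.

Let $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(A)$ be (a representative in the class of) a deformation
of $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$:

$$\begin{array}{ccc}
 & & \mathrm{GL}_2(A) \\
 & {\scriptstyle \rho} \nearrow & \downarrow \\
G_{\mathbb{Q}} & \xrightarrow{\;\rho_0\;} & \mathrm{GL}_2(k)
\end{array}$$

For a fixed representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_m(k)$ over the finite field $k$ let us
write

$$S = \{l \text{ prime} \mid \rho_0(I_l) \ne I_m\}.$$

In order to deal with finite sets of deformations of $\rho_0$, we shall also fix the
type $\mathcal{D} = \mathcal{D}_\Sigma$ of our deformations, where $\Sigma$ denotes a finite set of primes such
that $\Sigma \cap S = \emptyset$.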

The main discovery in Wiles’ marvelous proof is a method for counting 
two different types of objects: 


1) Galois representations coming from elliptic curves over Q of a given type; 


2) primitive cusp eigenforms of weight 2 and given level N, and with Fourier 
coefficients in Q. 


Let us fix a two-dimensional representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ over the finite
field $k$, and sets $S$ and $\Sigma$ as above.


Definition 7.22. a) A deformation $\rho$ of $\rho_0$ in $A$ is said to be of type $\mathcal{D} = \mathcal{D}_\Sigma$
if the following conditions i)-iv) are satisfied:

i) $\rho$ is unramified outside of the set $S \cup \Sigma \cup \{p\}$;

ii) $\det \rho = \chi_p : G_{\mathbb{Q}} \to \mathbb{Z}_p^\times \to A^\times$, the cyclotomic character;

iii) for all $l \in S$ with $l \ne p$, $\rho|_{I_l} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$ (semistability at $l$);

iv) the restriction $\rho|_{D_p}$ satisfies a certain local condition (it is “good”).
This means that $\rho|_{D_p}$ is either “flat” or “ordinary”.
“Flat” means that for any ideal $\mathfrak{a} \subset A$ of finite index, the reduced
representation $\rho \bmod \mathfrak{a} : G_{\mathbb{Q}} \to \mathrm{GL}_m(A/\mathfrak{a})$ comes from a finite flat
group scheme over $\mathbb{Z}_p$.
“Ordinary” means that

$$\rho|_{D_p} \sim \begin{pmatrix} * & * \\ 0 & \chi \end{pmatrix} \quad (\text{with an unramified character } \chi).$$

b) A deformation $\rho$ of $\rho_0$ is said to be admissible, if it is of type $\mathcal{D}_\Sigma$ for some
finite set $\Sigma$.


For finite flat group schemes, we refer to [Ta95].

Our goal is to show that any admissible deformation $\rho$ of a modular
representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ is also modular (under some absolute
irreducibility conditions on $\rho_0$).

Remark 7.23. Definition 7.22 implies that any deformation $\rho$ of type $\mathcal{D}_\Sigma$
factorizes through the projection $G_{\mathbb{Q}} \to G_{\Sigma_S} = \mathrm{Gal}(\mathbb{Q}_{\Sigma_S}/\mathbb{Q})$, where $\mathbb{Q}_{\Sigma_S}$ is the
maximal algebraic extension of $\mathbb{Q}$ unramified outside $\Sigma_S = S \cup \Sigma \cup \{p\}$.


7.3.3 Modular Galois representations 


We have already encountered modular Galois representations over finite fields
and local fields in §6.1 and §7.2. We now describe a more general notion of a
modular Galois representation $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(A)$ over a ring $A$. Rather than
fixing a modular form of type $(N, k, \chi)$
$$h(z) = \sum_{n \ge 1} c_n e(nz) \in S_k(N, \chi),$$
which is an eigenform of the Hecke operators, normalized by $c_1 = 1$, such that
$\mathrm{Tr}\, \rho(\mathrm{Frob}_l)$ is expressed in terms of $c_l$ for all primes $l \nmid Np$, we shall instead use
a homomorphism $\pi : \mathbb{T}'(N) \to A$ from an appropriate Hecke $\mathbb{Z}$-algebra such
that $\mathrm{Tr}\, \rho(\mathrm{Frob}_l)$ is expressed in terms of $\pi(T_l)$ for all "good" Hecke operators
$T_l$ (indexed by the primes $l \nmid Np$, see §6.3.6).

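Recall that there is an isomorphism (see §6.3.1)
$$\Gamma_0(N)/\Gamma_1(N) \cong (\mathbb{Z}/N\mathbb{Z})^\times, \qquad \sigma_d = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \bmod \Gamma_1(N) \mapsto d \bmod N.$$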


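Definition 7.24. a) The "diamond" operator $\langle d \rangle$ on
$$M_k(N) = M_k(\Gamma_1(N)) = \bigoplus_{\psi \bmod N} M_k(N, \psi)$$
is defined by
$$\langle d \rangle f = f|_k \sigma_d = (cz + d)^{-k} f\!\left(\frac{az + b}{cz + d}\right).$$
In particular, for any $f \in M_k(\Gamma_1(N))$ we have:
$$f \in M_k(N, \psi) \iff \forall d \in (\mathbb{Z}/N\mathbb{Z})^\times,\ \langle d \rangle f = \psi(d) f.$$

b) The Hecke operators $T_l$ (see (6.8.32)) are defined for all
$f(z) = \sum_{n \ge 0} a_n e(nz)$ in $M_k(N)$ by
$$T_l f = U_l f + V_l(f),$$
where
$$U_l f = \sum_{n \ge 0} a_{ln}\, e(nz), \qquad V_l(f) = l^{k-1} \sum_{n \ge 0} a_n(\langle l \rangle f)\, e(lnz).$$

c) The Hecke algebra $\mathbb{T}'(N)$ over $\mathbb{Z}$ is defined by
$$\mathbb{T}'(N) = \mathbb{Z}[T_l, \langle d \rangle \mid l \nmid N,\ d \in (\mathbb{Z}/N\mathbb{Z})^\times].$$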


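Definition 7.25. A Galois representation $\rho$ over a ring $A$ is said to be modular of
level $N$ if there exists a ring homomorphism $\pi : \mathbb{T}'(N) \to A$ such that for all
primes $l \nmid N$
$$\mathrm{tr}\, \rho(\mathrm{Frob}_l) = \pi(T_l), \qquad \det \rho(\mathrm{Frob}_l) = \pi(\langle l \rangle)\, l^{k-1}$$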


(see [Ste95] and §6.4.1). Following Hecke and Petersson (see §6.3.6), the action
of $\mathbb{T}'(N)$ on the complex vector space $S_k(N)$ can be orthogonally diagonalized.
Suppose $\{f\}$ is an orthogonal basis and each $f(z) = \sum_{n \ge 1} a_n e(nz) \in S_k(N)$
is primitive. Then $a_1 = 1$ and $T_l f = a_l f$ with $a_l \in \mathcal{O}_{\overline{\mathbb{Q}}}$. As before we fix an
embedding $i_p : \overline{\mathbb{Q}} \to \overline{\mathbb{Q}}_p$, and consider a finite extension $K$ of $\mathbb{Q}_p$ containing
all the $i_p(a_l)$ with $l \nmid Np$. Let $\mathcal{O} \supset \mathbb{Z}_p$ be the ring of integers of $K$, and let $\lambda$
denote its maximal ideal.

In the following theorem, we identify $\overline{\mathbb{Q}}$ with its image $i_p(\overline{\mathbb{Q}}) \subset \overline{\mathbb{Q}}_p$. Thus
we identify the elements $a_l \in i_p^{-1}(\mathcal{O}) \subset \overline{\mathbb{Q}}$ with $i_p(a_l)$, omitting the symbol $i_p$.


Theorem 7.26 (Eichler-Shimura-Deligne). For any prime $p$, and for
any primitive cusp eigenform $f(z) = \sum_{n \ge 1} a_n e(nz) \in S_k(N, \chi)$ there exists
a modular representation $\rho = \rho_{f,\lambda}$ with coefficients in $A = \mathcal{O}$ such that
$\pi : T_l \mapsto a_l$.


Idea of the construction. Assume for simplicity that $k = 2$, $\chi$ is trivial and
$a_n \in \mathbb{Z}$ ($n \ge 1$); then $\mathcal{O} = \mathbb{Z}_p$ and $\lambda = p\mathbb{Z}_p$. Let us consider the holomorphic
differential $\omega_f = f(z)\,dz$ and the lattice of periods (see §5.3.5
and §6.3.2)
$$\Lambda_f = \Big\langle \int_\gamma \omega_f \ \Big|\ \gamma \text{ is a closed path on } X_0(N) \Big\rangle \subset \mathbb{C}.$$

It turns out that $E = E_f = \mathbb{C}/\Lambda_f$ is then an elliptic curve defined over $\mathbb{Q}$. We
define the Galois representation
$$\rho_{f,p} = \rho_{f,\lambda} = \rho_{p,E} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{Z}_p).$$
According to the congruence relation of Eichler-Shimura,
$$\mathrm{tr}(\rho_{p,E}(\mathrm{Frob}_l)) = a_l = l + 1 - \#\tilde{E}(\mathbb{F}_l), \qquad \det(\rho_{p,E}(\mathrm{Frob}_l)) = l \quad \text{for all } l \nmid pN,$$
where $\tilde{E}$ denotes the reduction $E \bmod l$ (which is good in this case by the
criterion of Néron-Ogg-Shafarevich, [Se68a]). In fact $E$ is isogenous to a factor
of the Jacobian $J_0(N)$ coming from the cusp form $f$ given by (7.1.3) (see §6.4.3).

When $\mathbb{Q}((a_n)_{n \ge 1}) \neq \mathbb{Q}$, but $k = 2$ and $\chi$ is trivial, one obtains $\rho_{f,\lambda}$ via
an Abelian variety with real multiplication (cf. [Shi71]). When $k > 2$ or $\chi$ is
nontrivial, the construction is more complicated (cf. [Del68]), but these cases
are not used in Wiles' proof.
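The congruence relation $a_l = l + 1 - \#\tilde{E}(\mathbb{F}_l)$ can be checked numerically in a small case. The sketch below is an illustration only and is not taken from the text: it assumes the classical level 11 example, the curve $y^2 + y = x^3 - x^2$ together with the weight 2 cusp form $q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$, and the function names are hypothetical.

```python
def eta_product_coefficients(num_terms):
    """Coefficients a_1..a_num_terms of q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2."""
    # poly[i] is the coefficient of q^i in the (truncated) infinite product.
    poly = [0] * num_terms
    poly[0] = 1
    for n in range(1, num_terms):
        # Two factors (1 - q^n) and two factors (1 - q^(11n)).
        for step in (n, n, 11 * n, 11 * n):
            if step < num_terms:
                # In-place multiplication by (1 - q^step); going downwards
                # guarantees poly[i - step] still holds the old value.
                for i in range(num_terms - 1, step - 1, -1):
                    poly[i] -= poly[i - step]
    # The leading factor q shifts every exponent by one.
    return {n: poly[n - 1] for n in range(1, num_terms + 1)}


def count_points(l):
    """Number of points of y^2 + y = x^3 - x^2 over F_l, including infinity."""
    affine = sum(
        1
        for x in range(l)
        for y in range(l)
        if (y * y + y - x ** 3 + x * x) % l == 0
    )
    return affine + 1


if __name__ == "__main__":
    a = eta_product_coefficients(20)
    for l in (2, 3, 5, 7, 13, 17, 19):  # primes of good reduction (l != 11)
        print(l, a[l], l + 1 - count_points(l))
```

For each prime in the loop the two printed values agree, which is exactly the statement $\mathrm{tr}\,\rho_{p,E}(\mathrm{Frob}_l) = a_l$ for this particular curve.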

In order to define a homomorphism $\pi : \mathbb{T}'(N) \to A$, we put $\pi(X) = \lambda_f(X)$
for any $X \in \mathbb{T}'(N)$, where $f|X = \lambda_f(X) f$, and $\lambda_f(T_l) = a_l \in \mathcal{O} \subset A$. This
gives a homomorphism $\pi : \mathbb{T}'(N) \to A$.


7.3.4 Admissible Deformations and Modular Deformations 


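Consider again a local $\mathcal{O}$-algebra $A$ with maximal ideal $\mathfrak{m}_A$, where $\mathcal{O} \supset \mathbb{Z}_p$
denotes (as in §7.2) the ring of integers of a finite extension $K \supset \mathbb{Q}_p$. Thus
$\mathcal{O}$ is a discrete valuation ring (DVR), and $\lambda$ denotes its maximal ideal. We
assume always that
$$A/\mathfrak{m}_A = \mathcal{O}/\lambda = k \supset \mathbb{F}_p,$$
and we fix a two-dimensional representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ over the
finite field $k$, and sets $S$ and $\Sigma$ as above.

Let $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(A)$ denote (a representative in the class of) a deformation
of $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ of type $\mathcal{D}_\Sigma$:
$$
\begin{array}{ccc}
 & & \mathrm{GL}_2(A) \\
 & \overset{\rho}{\nearrow} & \downarrow \\
G_{\mathbb{Q}} & \xrightarrow{\ \rho_0\ } & \mathrm{GL}_2(k)
\end{array}
$$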


Definition 7.27. Let $DA_\Sigma(A)$ denote the set of all admissible deformations
of $\rho_0$ of type $\mathcal{D}_\Sigma$ and $DM_\Sigma(A)$ the subset of $DA_\Sigma(A)$ consisting of all modular
deformations of $\rho_0$ of type $\mathcal{D}_\Sigma$.

We shall see that the set $DA_\Sigma(A)$ is finite (in fact, $A \mapsto DA_\Sigma(A)$ is a
functor with values in finite sets, see [Da95]). The main theorem of Wiles'
proof says that under suitable conditions on $\rho_0$, both sets coincide for any $A$
as above.

Consider the subgroup of index 2
$$G_{\mathbb{Q}\left(\sqrt{(-1)^{(p-1)/2}\, p}\right)} \subset G_{\mathbb{Q}}$$
corresponding to the unique quadratic extension of $\mathbb{Q}$ unramified outside $p$.

Theorem 7.28 (Modularity of admissible deformations). Suppose that
$\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ is a modular representation over a finite field $k$, and that
the restriction
$$\rho_0\big|_{G_{\mathbb{Q}\left(\sqrt{(-1)^{(p-1)/2}\, p}\right)}}$$
is absolutely irreducible.
Then
$$DA_\Sigma(A) = DM_\Sigma(A). \tag{7.3.1}$$
That is, every admissible deformation is modular.


Remark 7.29. i) Under the assumptions of Theorem 7.28 both representations
$\rho_0$ and $\rho$ are semistable at all places $l \in S$.

ii) The strong condition of absolute irreducibility of the restriction of $\rho_0$ implies
(trivially) that $\rho_0$ itself is absolutely irreducible. This fact is important
in Theorem 7.4 of Mazur-Ribet, which we now restate in the following form:

Theorem 7.30 (of K.Ribet on the existence of minimal deformations).
Suppose that $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ is a modular and absolutely irreducible
representation over a finite field $k$. Then the set $DM_\emptyset(A)$ of minimal
deformations of $\rho_0$ is non-empty (that is, there exists a modular deformation
$\rho$ of $\rho_0$ of minimal level $N(\rho_0)$, where $N(\rho_0)$ is the Artin conductor of $\rho_0$).


Example 7.31 (The Frey-Hellegouarch curve). Consider again $E = E_{a^p, b^p, c^p}$
with $a^p \equiv -1 \bmod 4$, $2 \mid b$, $p \ge 5$, and let $\rho_0 = \overline{\rho}_{p,E} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_p)$. Then

i) $N(\rho_0) = 2$ (see [Se87], p. 201).

ii) For Frey-Hellegouarch curves, we have $|E[2](\mathbb{Q})| = 4$, since the points of
order 2 correspond to the roots of the cubic polynomial $x(x - a^p)(x + b^p)$.
Using this fact we may show that $\rho_0$ is irreducible for $p \ge 5$ by Mazur's
theorem [Maz77] and 1.3.7: the only possibilities for the torsion subgroup
of $E(\mathbb{Q})$ are (up to isomorphism):
$$\mathbb{Z}/n\mathbb{Z} \ (1 \le n \le 10, \text{ and } n = 12), \qquad \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2n\mathbb{Z} \ (1 \le n \le 4).$$
Suppose the $G_{\mathbb{Q}}$-module $V = E[p]$ has a $G_{\mathbb{Q}}$-invariant line $W$ over $\mathbb{F}_p$.
If $W$ is fixed pointwise by the action of $G_{\mathbb{Q}}$, then $E(\mathbb{Q})$ has a torsion
subgroup isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2p\mathbb{Z}$, which would directly contradict
Mazur's theorem. If on the other hand $W$ is invariant under $G_{\mathbb{Q}}$ only as
a line over $\mathbb{F}_p$, then the quotient curve $E' = E/W$, isogenous to $E$, can
also be defined over $\mathbb{Q}$. One can show that $E'$ has a rational point of order
$p$ and three points of order 2, again contradicting Mazur's theorem (see
[He97]).

iii) Assuming the modularity of $\rho_0$ (modulo $p$), one can simply apply Ribet's
Theorem 7.30 to deduce that $E = E_{a^p, b^p, c^p}$ does not exist. However, it
seems that we can only prove the modularity of $\rho_0$ by proving the modularity
of $\rho_{p,E}$; this is why we require Theorem 7.28 (on modularity of
admissible deformations).
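The appeal to Mazur's theorem in ii) is a finite check, which the following sketch spells out. It is an illustration under stated assumptions: groups are encoded by tuples of invariants, and the helper names are hypothetical. It uses the elementary fact that $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2n\mathbb{Z}$ contains a copy of $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2m\mathbb{Z}$ exactly when $m$ divides $n$, and that a cyclic group contains no such subgroup.

```python
def mazur_torsion_groups():
    """Invariant tuples of the torsion groups allowed by Mazur's theorem."""
    cyclic = [(n,) for n in list(range(1, 11)) + [12]]
    non_cyclic = [(2, 2 * n) for n in range(1, 5)]
    return cyclic + non_cyclic


def torsion_could_contain(m):
    """Can some allowed torsion group contain Z/2Z x Z/2mZ as a subgroup?"""
    for group in mazur_torsion_groups():
        # Only the non-cyclic groups (2, 2n) can contain Z/2 x Z/2m,
        # and they do so exactly when m divides n.
        if len(group) == 2 and (group[1] // 2) % m == 0:
            return True
    return False


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, torsion_could_contain(p))
```

For every prime $p \ge 5$ the answer is negative, since the largest non-cyclic group on the list is $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/8\mathbb{Z}$; this is the contradiction used in the first case of ii).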
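7.3.5 Universal Deformation Rings


We need to show that under the notations and assumptions of Theorem 7.28
one has
$$DA_\Sigma(A) = DM_\Sigma(A),$$
for any local Noetherian $\mathcal{O}$-algebra $A$ with maximal ideal $\mathfrak{m}_A$, and for any
two-dimensional representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ as above ($A/\mathfrak{m}_A = \mathcal{O}/\lambda = k$).

Reformulation of the identity (7.3.1):
We shall use the representability of the two functors
$$DA_\Sigma \supset DM_\Sigma : \mathcal{C}_{\mathcal{O}} \to \mathrm{Sets}_{\mathrm{fin}}$$
(for any finite set $\Sigma$). This means that there exist "universal" objects (called
universal deformation rings) $R_\Sigma, \mathbb{T}_\Sigma \in \mathcal{C}_{\mathcal{O}}$ such that for any $A \in \mathcal{C}_{\mathcal{O}}$ one has
$$DA_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R_\Sigma, A) \supset DM_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(\mathbb{T}_\Sigma, A).$$
In particular, substituting $A = \mathbb{T}_\Sigma$ we obtain a canonical morphism
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma, \tag{7.3.2}$$
and the canonical universal pairs
$$(R_\Sigma, \rho_\Sigma^{\mathrm{univ}}) \quad \text{and} \quad (\mathbb{T}_\Sigma, \rho_\Sigma^{\mathrm{univ.mod.}})$$
are related by the commutative diagram:
$$
\begin{array}{ccc}
 & & \mathrm{GL}_2(R_\Sigma) \\
 & \overset{\rho_\Sigma^{\mathrm{univ}}}{\nearrow} & \\
G_{\mathbb{Q}} & & \Big\downarrow \varphi_\Sigma \\
 & \underset{\rho_\Sigma^{\mathrm{univ.mod.}}}{\searrow} & \\
 & & \mathrm{GL}_2(\mathbb{T}_\Sigma)
\end{array}
\tag{7.3.3}
$$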


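In order to count the sets $DA_\Sigma(A)$ and $DM_\Sigma(A)$, a clever choice of the
augmentations is used for the local algebras $R_\Sigma$ and $\mathbb{T}_\Sigma$:
$$\pi_{R_\Sigma} : R_\Sigma \to \mathcal{O}, \qquad \pi_{\mathbb{T}_\Sigma} : \mathbb{T}_\Sigma \to \mathcal{O}.$$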


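Remark 7.32. (J.-P. Serre) The universal deformation rings $R_\Sigma$ and $\mathbb{T}_\Sigma$ are
topologically generated by the elements $\mathrm{tr}(\rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_l)) \in R_\Sigma$ for all primes
$l \notin \Sigma_S$.

This fact means also that both universal representations
$$\rho_\Sigma^{\mathrm{univ}} : G_{\mathbb{Q}} \to \mathrm{GL}_2(R_\Sigma), \qquad \rho_\Sigma^{\mathrm{univ.mod.}} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{T}_\Sigma)$$
are determined by their traces (see [Da95]).
We may therefore define the augmentation maps by the following relations:
$$\pi_{R_\Sigma} : \mathrm{tr}\, \rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_l) \mapsto a_l, \tag{7.3.4}$$
$$\pi_{\mathbb{T}_\Sigma} : \mathrm{tr}\, \rho_\Sigma^{\mathrm{univ.mod.}}(\mathrm{Frob}_l) \mapsto a_l,$$
where $a_l \in \mathcal{O}$ are the Fourier coefficients of Ribet's cusp eigenform
$$f(z) = \sum_{n=1}^{\infty} a_n e(nz) \in S_2(N_0)$$
of minimal level (whose Galois representation $\rho_{f,\lambda}$ corresponds to a deformation
$\rho \in DM_\emptyset(\mathcal{O})$).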
We shall denote by
$$\tilde{\rho} = \rho_{f,\lambda} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
the corresponding modular Galois representation of the minimal level $N_0$.
Using the map (7.3.2), one can interpret Theorem 7.28 as an isomorphism
of the local Noetherian $\mathcal{O}$-algebras $R_\Sigma$ and $\mathbb{T}_\Sigma$:

Theorem 7.33 (Main Theorem on isomorphism of universal deformation rings).
Under the assumptions of Theorem 7.28, the canonical morphism (7.3.2)
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$$
of universal deformation rings is an isomorphism in the category $\mathcal{C}_{\mathcal{O}}$.


7.4 Wiles’ Main Theorem and Isomorphism Criteria for 
Local Rings 


7.4.1 Strategy of the proof of the Main Theorem 7.33 


Let us consider again a local $\mathcal{O}$-algebra $A$ with maximal ideal $\mathfrak{m}_A$, where
$\mathcal{O} \supset \mathbb{Z}_p$ denotes (as in section 7.2) the ring of integers of a finite extension
$K \supset \mathbb{Q}_p$; $\mathcal{O}$ is a discrete valuation ring (DVR), and $\lambda$ denotes the maximal
ideal of $\mathcal{O}$. We always assume that
$$A/\mathfrak{m}_A \cong \mathcal{O}/\lambda = k \supset \mathbb{F}_p,$$
and we fix a two-dimensional representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ over the finite
field $k$, together with sets $S$ and $\Sigma$ as described above.
Recall that Ribet's modular Galois representation
$$\tilde{\rho} = \rho_{f,\lambda} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
of minimal level $N_0$ given by Theorem 7.30 belongs to the (non-empty) set
$DM_\emptyset(\mathcal{O})$. This gives a distinguished element of each of the sets $DM_\Sigma(A) \subset DA_\Sigma(A)$.
This representation $\tilde{\rho}$ is used in an explicit construction of the modular
universal deformation ring $\mathbb{T}_\Sigma$, see [CSS95].


Surjectivity of the map $\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$ (7.3.2) can be easily deduced from the
fact (see 7.32) that the universal deformation rings $R_\Sigma$ and $\mathbb{T}_\Sigma$ are topologically
generated by the elements $\mathrm{tr}(\rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_l)) \in R_\Sigma$ for primes $l \notin \Sigma_S$, see
§7.4.2 below.

Injectivity of $\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$ was proved by A. Wiles by an induction argument
on $\Sigma$. For a prime $l$ not in $\Sigma_S$, we let $\Sigma' = \Sigma \cup \{l\}$. Wiles deduced the
bijectivity of $\varphi_{\Sigma'}$ from the bijectivity of $\varphi_\Sigma$ using an isomorphism criterion
for local rings. This criterion was formulated in terms of certain invariants
(discovered by Wiles earlier, in spring 1991, see the introduction of his paper
[Wi]). However, in order to start the induction one needed the case $\Sigma = \emptyset$
(the base of induction). This was the point which caused a problem in 1993,
after the announcement of a complete proof of FLT, and which was repaired
in 1994 by A.Wiles and R.Taylor using a horizontal version of Iwasawa theory
together with a second isomorphism criterion for local rings. In this section
we describe these criteria and give explicit constructions (due to H.Lenstra
and B.Mazur) of the universal deformation ring $R_\Sigma$.


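7.4.2 Surjectivity of $\varphi_\Sigma$


In order to prove the surjectivity, we assume the existence of the universal
deformation rings $R_\Sigma, \mathbb{T}_\Sigma \in \mathcal{C}_{\mathcal{O}}$. Thus for any $A \in \mathcal{C}_{\mathcal{O}}$ we have
$$DA_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R_\Sigma, A) \supset DM_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(\mathbb{T}_\Sigma, A),$$
implying the existence of a canonical morphism (7.3.2)
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma.$$

Lemma 7.34. Let $A = R_\Sigma$ (resp. $A = \mathbb{T}_\Sigma$), and denote by $A^0$ the subring
of $A$ which is the topological closure of the $\mathcal{O}$-subalgebra in $A$ generated by
all elements $\mathrm{tr}(\rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_l)) \in R_\Sigma$ (resp. $\mathrm{tr}(\rho_\Sigma^{\mathrm{univ.mod.}}(\mathrm{Frob}_l)) \in \mathbb{T}_\Sigma$). Then
$A^0 = A$.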


This lemma can be deduced from the following: 


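Proposition 7.35. Let $A^0 \subset A$ be two local rings with maximal ideals satisfying
$$\mathfrak{m}_{A^0} = \mathfrak{m}_A \cap A^0$$
and with the same finite residue field $k$. Suppose
$$\rho : G \to \mathrm{GL}_m(A)$$
is a representation of a group $G$ over $A$ such that

1) $\overline{\rho} = \rho \bmod \mathfrak{m}_A$ is absolutely irreducible;
2) $\mathrm{tr}\, \rho(\sigma) \in A^0$ for all $\sigma \in G$.

Then $\rho$ is conjugate over $A$ to a representation
$$\rho^0 : G \to \mathrm{GL}_m(A^0).$$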


Proof of Proposition 7.35.

Let $B$ denote the $A^0$-subalgebra in $M_m(A)$ generated by $\rho(G)$. The image
of $B$ in $M_m(k)$ is a central simple algebra over the finite field $k$. It follows from
the triviality of the Brauer group (see §5.5.5) of the finite field $k$ that the image
of $B$ in $M_m(k)$ is the whole of $M_m(k)$. Let $e_1, \dots, e_{m^2}$ be elements of $B$ whose
reductions modulo $\mathfrak{m}_A$ form the standard basis of $M_m(k) = B \bmod \mathfrak{m}_A$. We
shall show that $e_1, \dots, e_{m^2}$ is a basis for $B$ over $A^0$. By Nakayama's lemma
elements of $B$ may be expressed in the form:
$$b = \sum_{i=1}^{m^2} a_i e_i, \quad \text{with } a_i \in A.$$
Hence
$$\mathrm{tr}(b \cdot {}^t e_j) = \sum_{i} a_i\, \mathrm{tr}(e_i\, {}^t e_j), \quad \text{with } j = 1, \dots, m^2. \tag{7.4.1}$$
Let us define
$$c_{ij} = \mathrm{tr}(e_i\, {}^t e_j) \in A^0 \implies (c_{ij}) \equiv I_{m^2} \bmod \mathfrak{m}_A.$$
Hence the system (7.4.1) is solvable over the local ring $A^0$. One defines $V \subset A^m$
to be the submodule generated by the columns of elements in $B$. Thus
$V \cong (A^0)^m$ is free, and we deduce that $B \to \mathrm{End}(V) \cong M_m(A^0)$ by Nakayama's
lemma.
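The key point of the proof is that the matrix $(c_{ij})$ reduces to the identity modulo the maximal ideal, so the linear system (7.4.1) is invertible. The sketch below checks the underlying fact for the standard basis of $M_m$ over the integers: the Gram matrix of the pairing $(x, y) \mapsto \mathrm{tr}(x \cdot {}^t y)$ is the identity of size $m^2$. The helper names are hypothetical and the check is only an illustration.

```python
def elementary_matrix(m, row, col):
    """The m x m matrix with a single 1 in position (row, col)."""
    return [[1 if (r, c) == (row, col) else 0 for c in range(m)] for r in range(m)]


def transpose(a):
    return [list(column) for column in zip(*a)]


def trace_of_product(a, b):
    """tr(a * b) computed without forming the product."""
    m = len(a)
    return sum(a[i][k] * b[k][i] for i in range(m) for k in range(m))


def gram_matrix(m):
    """Matrix (c_ij) with c_ij = tr(e_i * transpose(e_j)) for the standard basis."""
    basis = [elementary_matrix(m, r, c) for r in range(m) for c in range(m)]
    return [[trace_of_product(e_i, transpose(e_j)) for e_j in basis] for e_i in basis]


if __name__ == "__main__":
    for row in gram_matrix(2):
        print(row)
```

Since the $e_i$ in the proof reduce to this standard basis, their Gram matrix is congruent to $I_{m^2}$ modulo $\mathfrak{m}_A$ and therefore invertible over the local ring $A^0$.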
7.4.3 Constructions of the universal deformation ring $R_\Sigma$


We assume that $\rho_0$ is absolutely irreducible.

To prove the existence of $R_\Sigma$ one can either appeal to a general criterion
of Schlessinger (cf. Mazur's paper in [CSS95]), or instead use a more explicit
method of H.Lenstra (cf. the paper of Bart de Smit and H.W.Lenstra in
[CSS95]).

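Consider first a finite group $G$, and let us define an $\mathcal{O}$-algebra $\mathcal{O}[G, m]$
with generators
$$X_{ij}^g \quad (i, j = 1, \dots, m;\ g \in G)$$
and the following relations ($e$ denotes the identity of $G$):
$$X_{ij}^e = \delta_{ij}, \qquad X_{ij}^{gh} = \sum_{l=1}^{m} X_{il}^g X_{lj}^h, \qquad i, j = 1, \dots, m;\ g, h \in G.$$
As these relations mimic the relations satisfied by matrix coefficients of a
representation of $G$, it follows that for any $A \in \mathcal{C}_{\mathcal{O}}$ there is a canonical identification
$$\mathrm{Hom}_{\mathcal{O}\text{-alg}}(\mathcal{O}[G, m], A) = \mathrm{Hom}(G, \mathrm{GL}_m(A)). \tag{7.4.2}$$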
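To see why the relations defining this algebra are exactly the ones satisfied by matrix coefficients, the following sketch verifies them for one concrete representation. The choice of the permutation representation of the symmetric group $S_3$ on three letters is an assumption made for illustration, as are the function names.

```python
from itertools import permutations


def perm_matrix(g):
    """Matrix of the permutation g, sending the basis vector e_j to e_{g[j]}."""
    n = len(g)
    return [[1 if i == g[j] else 0 for j in range(n)] for i in range(n)]


def compose(g, h):
    """The permutation g o h (apply h first, then g)."""
    return tuple(g[h[j]] for j in range(len(h)))


def satisfies_relations(group, m):
    """Check X^e_ij = delta_ij and X^{gh}_ij = sum_l X^g_il X^h_lj."""
    X = {g: perm_matrix(g) for g in group}
    identity = tuple(range(m))
    unit_ok = all(
        X[identity][i][j] == (1 if i == j else 0)
        for i in range(m)
        for j in range(m)
    )
    product_ok = all(
        X[compose(g, h)][i][j] == sum(X[g][i][l] * X[h][l][j] for l in range(m))
        for g in group
        for h in group
        for i in range(m)
        for j in range(m)
    )
    return unit_ok and product_ok


if __name__ == "__main__":
    S3 = list(permutations(range(3)))
    print(satisfies_relations(S3, 3))
```

Conversely, any assignment of values in a ring $A$ to the generators that satisfies these relations defines a homomorphism from the group to $\mathrm{GL}_m(A)$, which is the content of the identification (7.4.2).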


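Substituting $A = \mathcal{O}/\lambda = k$ in the above formula, we obtain a homomorphism
$\pi_0$ of $\mathcal{O}$-algebras corresponding to $\rho_0$:
$$\mathrm{Hom}_{\mathcal{O}\text{-alg}}(\mathcal{O}[G, m], k) = \mathrm{Hom}(G, \mathrm{GL}_m(k)), \qquad \pi_0 \leftrightarrow \rho_0.$$
Let $\mathfrak{m}_0 = \mathrm{Ker}\, \pi_0$; we define the $\mathcal{O}$-algebra $R_G$ to be the completion of $\mathcal{O}[G, m]$
with respect to $\mathfrak{m}_0$:
$$R_G = \varprojlim_{n} \mathcal{O}[G, m]/\mathfrak{m}_0^n.$$

Now suppose we have a profinite group:
$$G_\Sigma = \varprojlim_{i} G_i.$$
Then we put
$$R_i = R_{G_i}, \qquad R_\Sigma = \varprojlim_{i} R_i.$$
It may be verified that
a)
$$\mathrm{Hom}_{\rho_0}(G_\Sigma, \mathrm{GL}_m(A)) = \mathrm{Hom}_{\mathcal{O}\text{-alg}}(R_\Sigma, A). \tag{7.4.3}$$

b) $R_\Sigma$ is a local Noetherian $\mathcal{O}$-algebra (to show this, one uses a universal
bound for the dimension of the tangent space of $R_i$, and the absolute
irreducibility of $\rho_0$).

7.4.4 A sketch of a construction of the universal modular
deformation ring $\mathbb{T}_\Sigma$


Let us again fix a two-dimensional modular representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$
over the finite field $k$, together with sets $S$ and $\Sigma$ as above.

We shall consider a slightly different Hecke algebra than in 7.3, Definition
7.24, namely,
$$\mathbb{T}(\Sigma) = \mathcal{O}[T_l, U_q, \langle d \rangle \mid l \nmid N_\Sigma,\ d \in (\mathbb{Z}/N_\Sigma\mathbb{Z})^\times,\ q \in S \cup \Sigma].$$
We shall regard $\mathbb{T}(\Sigma)$ as a subalgebra of $\mathrm{End}_{\mathcal{O}}\, S_2(N_\Sigma, \mathcal{O})$. In the above we
have
$$N_\Sigma = p \prod_{q \in S} q \prod_{l \in \Sigma} l^2.$$

Furthermore $S_2(N_\Sigma, \mathcal{O})$ denotes the $\mathcal{O}$-submodule of $\mathcal{O}[[q]]$ generated by all
formal $q$-expansions of the form
$$\sum_{n \ge 1} i_p(a_n) q^n \in \mathcal{O}[[q]]$$
such that
$$f = \sum_{n \ge 1} a_n q^n \in S_2(N_\Sigma; \overline{\mathbb{Q}})$$
is a cusp form with coefficients $a_n \in \overline{\mathbb{Q}} \cap i_p^{-1}(\mathcal{O})$.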
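Let
$$f = \sum_{n \ge 1} a_n q^n$$
denote Ribet's modular form of Theorem 7.30, attached to a two-dimensional
modular representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ over the finite field $k$.
Recall that Ribet's modular Galois representation
$$\tilde{\rho} = \rho_{f,\lambda} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
of minimal level $N_0$ given by Theorem 7.30 belongs to the (non-empty) set
$DM_\emptyset(\mathcal{O})$. For any $\Sigma$ as above, we define
$$f_\Sigma = \sum_{n \ge 1} a_n(f_\Sigma) q^n$$
by removing from the Mellin transform of $f$ the Euler factors at $l \in \Sigma$:
$$L(f_\Sigma, s) = \sum_{n \ge 1} a_n(f_\Sigma) n^{-s} = \prod_{q \in S} (1 - a_q q^{-s})^{-1} \prod_{l \nmid N_\Sigma} (1 - a_l l^{-s} + l^{1-2s})^{-1}. \tag{7.4.4}$$

Now consider the following ideal of the Hecke algebra:
$$M_\Sigma = (\lambda,\ T_l - a_l,\ U_q - a_q,\ T_l)_{l \notin \Sigma \cup S \cup \{p\},\ q \in S,\ l \in \Sigma}. \tag{7.4.5}$$
This ideal is actually prime, since
$$M_\Sigma = \mathrm{Ker}(\mathbb{T}(\Sigma) \to k[[q]]), \tag{7.4.6}$$
$$T_l \mapsto a_l \bmod \lambda, \qquad U_q \mapsto a_q \bmod \lambda, \qquad T_l \mapsto 0$$
$$(l \notin \Sigma \cup S \cup \{p\},\ q \in S,\ l \in \Sigma),$$
and the ring $k[[q]]$ is an integral domain.
We define $\mathbb{T}_\Sigma$ to be the completion of $\mathbb{T}(\Sigma)$ with respect to the ideal $M_\Sigma$:
$$\mathbb{T}_\Sigma = \varprojlim_{n} \mathbb{T}(\Sigma)/M_\Sigma^n.$$

One can check that $\mathbb{T}_\Sigma$ is a finite flat local Noetherian $\mathcal{O}$-algebra (i.e. it is a
free $\mathcal{O}$-module of finite rank), and one defines an augmentation map $\mathbb{T}_\Sigma \to \mathcal{O}$
using $f$.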
Theorem 7.36. There exists, up to isomorphism, a unique admissible Galois
representation
$$\rho_\Sigma^{\mathrm{univ.mod.}} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{T}_\Sigma), \tag{7.4.7}$$
with the following properties:
$$\mathrm{tr}(\rho_\Sigma^{\mathrm{univ.mod.}}(\mathrm{Frob}_l)) = T_l, \qquad \det(\rho_\Sigma^{\mathrm{univ.mod.}}(\mathrm{Frob}_l)) = l \quad (l \notin \Sigma \cup S \cup \{p\}). \tag{7.4.8}$$

The construction by A.Wiles of the universal representation $\rho_\Sigma^{\mathrm{univ.mod.}}$ was
obtained from the Eichler-Shimura Theorem 7.26 by patching together all
the modular deformations of type $\mathcal{D}_\Sigma$. To achieve this he used the theory
of pseudo-representations. The strong absolute irreducibility condition of
Theorem 7.28, concerning the restriction
$$\rho_0\big|_{G_{\mathbb{Q}\left(\sqrt{(-1)^{(p-1)/2}\, p}\right)}},$$
was essential in this construction.


7.4.5 Universality and the Chebotarev density theorem 


Let us recall Theorem 4.22 in the following form:

Theorem 7.37 (Chebotarev density theorem). Let $L/K$ be a finite extension
of number fields, and let $X$ be a non-empty subset of $G(L/K)$, invariant
under conjugation. Denote by $P_X$ the set of finite places $v$ of $K$, unramified
in $L$, such that the classes of Frobenius elements of these places belong to $X$:
$F_{L/K}(P_X) \subset X$. Then the set $P_X$ is infinite and has a density, which is equal
to $\mathrm{Card}\, X/\mathrm{Card}\, G(L/K)$.
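A numerical illustration of the density statement may help. The sketch below assumes the simplest case, chosen here for illustration and not taken from the text: $K = \mathbb{Q}$, $L = \mathbb{Q}(i)$, and $X$ the trivial class, so that $P_X$ consists of the primes $l \equiv 1 \bmod 4$ and the predicted density is $1/2$. The function names are hypothetical.

```python
def primes_up_to(bound):
    """Sieve of Eratosthenes returning all primes <= bound."""
    sieve = bytearray([1]) * (bound + 1)
    sieve[0:2] = b"\x00\x00"
    for n in range(2, int(bound ** 0.5) + 1):
        if sieve[n]:
            # Cross out the multiples of n starting from n^2.
            sieve[n * n :: n] = bytearray(len(range(n * n, bound + 1, n)))
    return [n for n in range(bound + 1) if sieve[n]]


def split_prime_ratio(bound):
    """Proportion of odd primes l <= bound with trivial Frobenius in Q(i)/Q."""
    odd_primes = [l for l in primes_up_to(bound) if l != 2]  # 2 is ramified
    split = [l for l in odd_primes if l % 4 == 1]
    return len(split) / len(odd_primes)


if __name__ == "__main__":
    for bound in (10 ** 3, 10 ** 4, 10 ** 5):
        print(bound, split_prime_ratio(bound))
```

The printed ratios approach $1/2 = \mathrm{Card}\, X/\mathrm{Card}\, G(L/K)$ as the bound grows, although convergence is slow.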


Corollary 7.38. The canonical morphism (7.3.2) is compatible with the augmentation
maps $\pi_{R_\Sigma}$ and $\pi_{\mathbb{T}_\Sigma}$:
$$\pi_{R_\Sigma} = \pi_{\mathbb{T}_\Sigma} \circ \varphi_\Sigma. \tag{7.4.9}$$

In fact, the traces of the representations attached to $\pi_{R_\Sigma}$ and $\pi_{\mathbb{T}_\Sigma} \circ \varphi_\Sigma$ coincide on the
subset of elements $\mathrm{Frob}_l$ ($l \notin \Sigma_S$) (which is dense in the group $G_{\Sigma_S}$). It follows
that the corresponding deformations are equivalent, hence they
coincide by the universal property.


7.4.6 Isomorphism Criteria for local rings


To prove that the canonical morphism (7.3.2)
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$$
of universal deformation rings is an isomorphism (in the category $\mathcal{C}_{\mathcal{O}}$), one
argues by induction on $\Sigma$. Let $\Sigma' = \Sigma \cup \{l\}$ for some prime $l$ not in $\Sigma_S$. Wiles
deduced the bijectivity of $\varphi_{\Sigma'}$ from the bijectivity of $\varphi_\Sigma$ using an isomorphism
criterion for local rings. This criterion is formulated in terms of certain invariants,
which will be described next. In order to start the induction, one needs
to prove the case $\Sigma = \emptyset$; this is achieved by a second isomorphism criterion
for local rings.


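Definition 7.39. A local Noetherian $\mathcal{O}$-algebra $A$ is called a complete intersection
if:

a) $A$ is a free $\mathcal{O}$-module of finite rank;
b) $A \cong \mathcal{O}[[X_1, \dots, X_n]]/(f_1, \dots, f_n)$

(cf. [Mats70]).
We shall use the following invariants of a local $\mathcal{O}$-algebra $A$ with augmentation $\pi_A : A \to \mathcal{O}$:
$$I_A = \mathrm{Ker}\, \pi_A, \qquad \Phi_A = I_A/I_A^2, \qquad \eta_A = \pi_A(\mathrm{Ann}\, I_A) \subset \mathcal{O}. \tag{7.4.10}$$

These are called respectively the kernel of augmentation, the tangent space
and the congruence module.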


Example 7.40. a) $A = \mathcal{O} = \mathbb{Z}_p$, $\Phi_A = I_A/I_A^2 = \{0\}$.
b) $A = \mathbb{Z}_p[[X, Y]]/(X(X - p), Y(Y - p))$, $\Phi_A \cong \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z}$, $\eta_A = (p^2)$.
The augmentation map in this case is given by
$\pi_A(f) = f(0, 0) \in \mathbb{Z}_p$.
The ring $A$ is a complete intersection.
c) $A = \mathbb{Z}_p[[X, Y]]/(X(X - p), Y(Y - p), XY)$, $\Phi_A \cong \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z}$, $\eta_A = (p)$.
The augmentation is given by
$\pi_A(f) = f(0, 0) \in \mathbb{Z}_p$.
In this case $A$ is not a complete intersection.


Theorem 7.41 (Criterion I). Let $\varphi : A \to B$ be a surjective morphism in
the category $\mathcal{C}_{\mathcal{O}}$. Then the following are equivalent:

(i) $\varphi$ is an isomorphism of two local complete intersection $\mathcal{O}$-algebras;

(ii) $\#\Phi_A \le \#\mathcal{O}/\eta_B < \infty$;

(iii) $\#\Phi_A = \#\mathcal{O}/\eta_B < \infty$.

Remark 7.42. In the first version of his proof A.Wiles had made the assumption
that the ring $B$ is Gorenstein (i.e. $\hat{B} = \mathrm{Hom}_{\mathcal{O}}(B, \mathcal{O})$ is a free $B$-module
of rank 1). This restriction was later removed by H.Lenstra.

Corollary 7.43. An $\mathcal{O}$-algebra $A \in \mathcal{C}_{\mathcal{O}}$ is a complete intersection ring if and
only if
$$\#\Phi_A = \#\mathcal{O}/\eta_A < \infty.$$

This is proved by applying Criterion I to the identity map $\mathrm{id}_A : A \to A$.
7.4.7 J-structures and the second criterion of isomorphism of
local rings


Let us consider the distinguished ideals
$$J_m = (\omega_m(S_1), \dots, \omega_m(S_n)) \subset \mathcal{O}[[S_1, \dots, S_n]],$$
where
$$\omega_m(S_1) = (1 + S_1)^{p^m} - 1, \ \dots, \ \omega_m(S_n) = (1 + S_n)^{p^m} - 1, \qquad J_0 = (S_1, \dots, S_n).$$


Definition 7.44. Let $\varphi : A \to B$ be a surjective morphism in $\mathcal{C}_{\mathcal{O}}$. One says
that $\varphi$ admits a J-structure, if there is a family of commutative diagrams,
indexed by $m \in \mathbb{N}$,

[commutative diagram over $\mathcal{O}[[S_1, \dots, S_n]]$ involving the maps $\xi_m$ and $\varphi_m$]

with the following properties for each $m$:

i) $\xi_m$ is surjective;
ii) $\varphi_m$ is surjective;
iv) $B_m/J_m B_m$ is a torsion free module of finite rank over the $\mathcal{O}$-algebra
$\mathcal{O}[[S_1, \dots, S_n]]/J_m$.


Theorem 7.45 (Criterion II).
Let $\varphi : A \to B$ be a surjective morphism in the category $\mathcal{C}_{\mathcal{O}}$.
If $\varphi$ admits a J-structure then $\varphi$ is an isomorphism of two local complete
intersection $\mathcal{O}$-algebras.

The proofs of both criteria belong to commutative algebra. We therefore refer
the reader to [CSS95], [Ta-Wi].


7.5 Wiles’ Induction Step: Application of the Criteria 
and Galois Cohomology 


7.5.1 Wiles’ induction step in the proof of 
Main Theorem 7.33 


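In §7.4.3-7.4.4 we explained the existence of the universal deformation rings
$R_\Sigma, \mathbb{T}_\Sigma \in \mathcal{C}_{\mathcal{O}}$. These universal rings represent the functors of admissible
deformations (respectively modular deformations). This means that for any $A \in \mathcal{C}_{\mathcal{O}}$
one has
$$DA_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R_\Sigma, A) \supset DM_\Sigma(A) = \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(\mathbb{T}_\Sigma, A).$$
In particular, substituting $A = \mathbb{T}_\Sigma$ we obtain a canonical morphism (7.3.2):
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma,$$
and the canonical universal pairs
$$(R_\Sigma, \rho_\Sigma^{\mathrm{univ}}) \quad \text{and} \quad (\mathbb{T}_\Sigma, \rho_\Sigma^{\mathrm{univ.mod.}})$$
are related by the commutative diagrams:
$$
\begin{array}{ccc}
R_\Sigma & \xrightarrow{\ \varphi_\Sigma\ } & \mathbb{T}_\Sigma \\
 & {\scriptstyle \pi_{R_\Sigma}} \searrow \quad \swarrow {\scriptstyle \pi_{\mathbb{T}_\Sigma}} & \\
 & \mathcal{O} &
\end{array}
\qquad\qquad
\begin{array}{ccc}
 & & \mathrm{GL}_2(R_\Sigma) \\
 & \nearrow & \\
G_{\mathbb{Q}} & & \Big\downarrow \varphi_\Sigma \\
 & \searrow & \\
 & & \mathrm{GL}_2(\mathbb{T}_\Sigma)
\end{array}
$$
(see §7.4.5).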
Let us recall the Main Theorem 7.33 in the following form: 


Theorem 7.46 (Main Theorem). Suppose that $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ is a
modular representation over a finite field $k$, such that the restriction
$$\rho_0\big|_{G_{\mathbb{Q}\left(\sqrt{(-1)^{(p-1)/2}\, p}\right)}}$$
is absolutely irreducible.
Then the canonical morphism (7.3.2)
$$\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$$
of universal deformation rings is an isomorphism of complete intersection
rings (in the category $\mathcal{C}_{\mathcal{O}}$).

Remark 7.47. i) We have already seen in §7.4.2 that $\varphi_\Sigma$ is surjective.

ii) Recall that the proof of injectivity of $\varphi_\Sigma : R_\Sigma \to \mathbb{T}_\Sigma$ was given by A. Wiles
using induction on $\Sigma$. We shall denote $\Sigma' = \Sigma \cup \{l\}$ for a prime $l$ not in
$\Sigma_S$. Wiles deduced the bijectivity of $\varphi_{\Sigma'}$ from the bijectivity of $\varphi_\Sigma$ using
Criterion I (see §7.4.6) for isomorphisms of local rings. This criterion is
formulated in terms of the invariants (7.4.10). To begin the induction one
needs the case $\Sigma = \emptyset$. This case is proved using Criterion II of §7.4.7 for
isomorphisms of local rings.


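The Induction Step

Let us consider
$$\Sigma' = \Sigma \cup \{l\}, \quad \mathcal{D} = \mathcal{D}_\Sigma, \quad \mathcal{D}' = \mathcal{D}_{\Sigma'}, \quad A = R_{\Sigma'}, \quad B = \mathbb{T}_{\Sigma'}.$$
We shall assume the induction hypothesis, i.e.
$$\varphi_\Sigma : R_\Sigma \xrightarrow{\ \sim\ } \mathbb{T}_\Sigma.$$
This implies the equality of the corresponding invariants:
$$\#\Phi_{R_\Sigma} = \#\mathcal{O}/\eta_{\mathbb{T}_\Sigma} < \infty. \tag{7.5.1}$$
According to Criterion I, it suffices to prove the inequality
$$\#\Phi_{R_{\Sigma'}} \le \#\mathcal{O}/\eta_{\mathbb{T}_{\Sigma'}} < \infty, \tag{7.5.2}$$
given the equality (7.5.1). The left hand side of this inequality is controlled
by a Galois cohomology group. The right hand side will be computed using
a determinant representing a relative invariant $\eta_{\Sigma', \Sigma}$, which relates $\#\mathcal{O}/\eta_{\mathbb{T}_{\Sigma'}}$
and $\#\mathcal{O}/\eta_{\mathbb{T}_\Sigma}$. A fundamental inequality relating these quantities will imply
the induction step.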


Base of induction: the minimal case 


We shall construct in the next section §7.6.3 a J-structure for the surjective
morphism $\varphi_\emptyset$. This will show by Criterion II (see §7.4.7) that both rings $R_\emptyset$
and $\mathbb{T}_\emptyset$ are isomorphic complete intersection rings.


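7.5.2 A formula relating $\#\Phi_{R_\Sigma}$ and $\#\Phi_{R_{\Sigma'}}$: preparation


We explain below in §7.5.5 a formula relating $\#\Phi_{R_\Sigma}$ and $\#\Phi_{R_{\Sigma'}}$, using Galois
cohomology groups with coefficients in certain $G_{\mathbb{Q}}$-modules, which we shall now
describe. Let
$$f = \sum_{n \ge 1} a_n q^n$$
denote Ribet's modular form of Theorem 7.30, attached to a two-dimensional
modular representation $\rho_0 : G_{\mathbb{Q}} \to \mathrm{GL}_2(k)$ over the finite field $k$.
Recall that Ribet's modular Galois representation
$$\tilde{\rho} = \rho_{f,\lambda} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
of minimal level $N_0$ given by Theorem 7.30 belongs to the (non-empty) set
$DM_\emptyset(\mathcal{O})$.
Consider the reduction
$$\tilde{\rho} \bmod \lambda^n : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O}/\lambda^n).$$
We shall use the following finite $G_{\mathbb{Q}}$-modules $X_{\lambda^n}$ defined by
$$X_{\lambda^n} = \mathrm{Ad}^0\, \tilde{\rho} \bmod \lambda^n \subset M_2(\mathcal{O}/\lambda^n), \qquad X_{\lambda^n} \subset \mathrm{Ad}^0(\tilde{\rho} \otimes K/\mathcal{O}) = X_{\tilde{\rho}}.$$
The notation $\mathrm{Ad}^0$ will be explained below, cf. §7.5.4 (we identify the $\mathcal{O}$-modules
$\mathcal{O}/\lambda^n$ and $\lambda^{-n}/\mathcal{O}$ by choosing a uniformizer).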
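7.5.3 The Selmer group and $\Phi_{R_\Sigma}$


In order to compute the invariant $\#\Phi_{R_\Sigma}$, A.Wiles established an isomorphism
of $\mathcal{O}$-modules
$$\mathrm{Hom}_{\mathcal{O}\text{-mod}}(\Phi_{R_\Sigma}, K/\mathcal{O}) \xrightarrow{\ \sim\ } \mathrm{Sel}_{\mathcal{D}_\Sigma} \subset H^1(G_{\mathbb{Q}}, X_{\tilde{\rho}}),$$
where
$$\mathrm{Sel}_{\mathcal{D}_\Sigma} = H^1_{\mathcal{D}_\Sigma}(G_{\mathbb{Q}}, X_{\tilde{\rho}})$$
is a generalized Selmer group. This is a finite $\mathcal{O}$-submodule of the (infinite)
$\mathcal{O}$-module $H^1(G_{\mathbb{Q}}, X_{\tilde{\rho}})$. The group $\mathrm{Sel}_{\mathcal{D}_\Sigma}$ is contained in the (usual) $\Sigma$-Selmer
group, which is a finite $\mathcal{O}$-submodule $\mathrm{Sel}_\Sigma$ of $H^1(G_{\mathbb{Q}}, X_{\tilde{\rho}})$, consisting of all
cohomology classes unramified outside of $\Sigma \cup S \cup \{p\}$, compare with (5.3.40):
$$\mathrm{Sel}_\Sigma(X_{\tilde{\rho}}) := \{ x \in H^1(G_{\mathbb{Q}}, X_{\tilde{\rho}}) \mid \forall l \notin \Sigma_S,\ \mathrm{Res}_{I_l}\, x = 0 \}, \tag{7.5.3}$$
where $\mathrm{Res}_{I_l}\, x$ denotes the restriction of $x$ to the inertia subgroup $I_l$.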


7.5.4 Infinitesimal deformations 


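Consider a representation $\rho : G \to \mathrm{GL}_n(A)$ of a group $G$. This determines an
$A[G]$-module structure on the free $A$-module $M = A^n$, with the action of $G$
given by $\rho$. We shall also consider the following $A[G]$-modules:
$$\mathrm{Ad}(M) = \mathrm{End}_A M, \qquad g : x \mapsto \rho(g)\, x\, \rho(g)^{-1}, \tag{7.5.4}$$
$$\mathrm{Ad}^0(M) = \mathrm{Ad}(M)^{\mathrm{tr}=0} = \mathrm{End}^{\mathrm{tr}=0}(M) \subset \mathrm{Ad}(M) \quad (\text{an } A[G]\text{-submodule}).$$

Definition 7.48. 1) An extension of $M$ by $M$ is a short exact sequence of
$A[G]$-modules of the form
$$0 \to M \xrightarrow{\ \alpha\ } E \xrightarrow{\ \beta\ } M \to 0.$$

2) Consider the ring $A[\varepsilon]$ generated over $A$ by an element $\varepsilon$ subject to the
relation $\varepsilon^2 = 0$. There is a projection map $\pi_{\varepsilon=0} : A[\varepsilon] \to A$ which takes $\varepsilon$
to 0. By an infinitesimal deformation of a representation $\rho : G \to \mathrm{GL}_n(A)$,
we shall mean a representation $\rho' : G \to \mathrm{GL}_n(A[\varepsilon])$, such that $\rho = \pi_{\varepsilon=0} \circ \rho'$:
$$
\begin{array}{ccc}
 & & \mathrm{GL}_n(A[\varepsilon]) \\
 & \overset{\rho'}{\nearrow} & \Big\downarrow \pi_{\varepsilon=0} \\
G & \xrightarrow{\ \rho\ } & \mathrm{GL}_n(A)
\end{array}
$$

3) Two infinitesimal deformations $\rho', \rho'' : G \to \mathrm{GL}_n(A[\varepsilon])$ are said to be
strictly equivalent if there exists a matrix $C \in 1_n + \varepsilon M_n(A[\varepsilon])$ such that
$$\text{for all } g \in G, \quad \rho''(g) = C \rho'(g) C^{-1}.$$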

Remark 7.49. The construction of an infinitesimal deformation may be viewed 
as the first step in the construction of any deformation. 


Theorem 7.50. There are canonical bijections between the following three
sets a), b), c):

a) $H^1(G, \mathrm{Ad}\, \rho)$;
b) The set $\mathrm{Ext}^1(M, M)$ of equivalence classes of extensions of $M$ by $M$;
c) The set of strict equivalence classes of infinitesimal deformations of $\rho$.


Proof of Theorem 7.50


Let us consider an extension
$$0 \to M \xrightarrow{\ \alpha\ } E \xrightarrow{\ \beta\ } M \to 0.$$
The module $M$ is a free $A$-module. Hence we may choose a section $\phi : M \to E$
of $\beta$, which is a morphism of $A$-modules but not necessarily of $A[G]$-modules.
For all $m \in M$ and $g \in G$ we have
$$g\phi(g^{-1}m) - \phi(m) \in \mathrm{Ker}(\beta) = \mathrm{Im}(\alpha),$$
since
$$\beta(g\phi(g^{-1}m) - \phi(m)) = g\beta\phi(g^{-1}m) - \beta\phi(m) = 0.$$
From this we obtain the following 1-cocycle (representing a cohomology class
in $H^1(G, \mathrm{Ad}(\rho))$):
$$T_g = \big(m \mapsto \alpha^{-1}(g\phi(g^{-1}m) - \phi(m))\big) \in \mathrm{End}_A(M) = \mathrm{Ad}\, \rho.$$
One rereads the cocycle condition by saying that
$$\rho'(g) = (1_n + \varepsilon T_g)\rho(g)$$
is an infinitesimal deformation.
Conversely, any cocycle $\{T_g\}$ representing a class in $H^1(G, \mathrm{Ad}\, \rho)$ defines an extension
$M \otimes_A A[\varepsilon] = M \oplus \varepsilon M$ with the action given by $\rho'$.

Remark 7.51. If $\det \rho' = \det \rho$, then there is the identity
$$\det(1_n + \varepsilon T_g) \det \rho(g) = \det \rho(g) \implies \mathrm{tr}\, T_g = 0,$$
and it follows that $H^1(G, \mathrm{Ad}^0 \rho)$ describes the deformations with fixed
determinant.
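The dictionary between cocycles and infinitesimal deformations can be exercised on a small example. The sketch below is an illustration under stated assumptions, with hypothetical helper names: it takes the permutation representation of $S_3$ over the integers and the coboundary $T_g = C - \rho(g) C \rho(g)^{-1}$ for an arbitrary matrix $C$. A coboundary is cohomologically trivial, but it still satisfies the cocycle identity $T_{gh} = T_g + \rho(g) T_h \rho(g)^{-1}$, which is all that is needed to test that $\rho'(g) = (1_n + \varepsilon T_g)\rho(g)$ is multiplicative modulo $\varepsilon^2$.

```python
from itertools import permutations


def perm_matrix(g):
    n = len(g)
    return [[1 if i == g[j] else 0 for j in range(n)] for i in range(n)]


def compose(g, h):
    return tuple(g[h[j]] for j in range(len(h)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def coboundary(C, rho_g):
    # T_g = C - rho(g) C rho(g)^{-1}; a permutation matrix has inverse = transpose.
    return sub(C, matmul(matmul(rho_g, C), transpose(rho_g)))


def deformation(C, g):
    """rho'(g) = (1 + eps*T_g) rho(g), stored as (constant part, eps part)."""
    rho_g = perm_matrix(g)
    return rho_g, matmul(coboundary(C, rho_g), rho_g)


def dual_mul(x, y):
    """Product in M_n(A[eps]) with eps^2 = 0."""
    return matmul(x[0], y[0]), add(matmul(x[0], y[1]), matmul(x[1], y[0]))


def is_homomorphism(C, group):
    return all(
        dual_mul(deformation(C, g), deformation(C, h)) == deformation(C, compose(g, h))
        for g in group
        for h in group
    )


if __name__ == "__main__":
    S3 = list(permutations(range(3)))
    C = [[1, 2, 0], [0, -1, 3], [4, 0, 5]]  # arbitrary integer matrix
    print(is_homomorphism(C, S3))
```

In this example each $T_g$ has trace zero, in line with Remark 7.51: the deformation is obtained by conjugating $\rho$ with $1_n - \varepsilon C$, so its determinant is unchanged.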


7.5.5 Deformations of type $\mathcal{D}_\Sigma$


The deformations of type $\mathcal{D}_\Sigma$ are defined starting from the restriction maps
$$\mathrm{Res}_{D_l} : H^1(G_{\mathbb{Q}}, X) \to H^1(D_l, X)$$
which are used in the definition of the generalized Selmer groups.


Examples of computations of Selmer groups

(comp. with §5.3, §4.5). Given a short exact sequence $0 \to A \to B \to C \to 0$
of three $G$-modules, there is a long exact sequence (4.5.23) of cohomology
groups:
$$0 \to H^0(A) \to H^0(B) \to H^0(C) \to H^1(A) \to H^1(B) \to \cdots, \qquad H^n(A) = H^n(G, A). \tag{7.5.5}$$

We shall use this sequence to deduce Kummer theory:


Example 7.52 (Kummer theory and Quadratic Characters). Let $K$ be a field
containing the group $\mu_m$ of all roots of unity of degree $m$ in $\overline{K}$. Assume further
that $\mathrm{Char}\, K$ does not divide $m$. For an arbitrary Galois extension $L/K$ with
Galois group $G = \mathrm{Gal}(L/K)$ the map $x \mapsto x^m$ defines a homomorphism of
$G$-modules: $\nu : L^\times \to L^\times$, and one has the following exact sequence
$$1 \to \mu_m \to \overline{K}^\times \xrightarrow{\ \nu\ } \overline{K}^\times \to 1.$$
Passing to cohomology groups (4.5.22) we obtain the following long exact
sequence
$$H^0(G_K, \mu_m) \to H^0(G_K, \overline{K}^\times) \xrightarrow{\ \nu\ } H^0(G_K, \overline{K}^\times) \to H^1(G_K, \mu_m) \to H^1(G_K, \overline{K}^\times) \to \cdots \tag{7.5.6}$$
Since $G$ acts trivially on $\mu_m$, it follows that $H^1(G, \mu_m)$ coincides with the
group $\mathrm{Hom}(G, \mu_m)$. The group $H^0(G, L^\times)$ is the subgroup of all fixed points
of the Galois action, i.e. $H^0(G, L^\times) = (L^\times)^{\mathrm{Gal}(L/K)} = K^\times$. Also, $H^0(G, \mu_m) = \mu_m$,
and $H^1(G, L^\times) = \{1\}$, by Hilbert's theorem 90. We thus have the following
exact sequence
$$1 \to \mu_m \to K^\times \xrightarrow{\ \nu\ } K^\times \to \mathrm{Hom}(G_K, \mu_m) \to 1,$$
which is equivalent to the isomorphism of Kummer:
$$K^\times/K^{\times m} \cong \mathrm{Hom}(G_K, \mu_m).$$

In particular, letting $m = 2$ and $K = \mathbb{Q}$, we have
$$H^1(G_{\mathbb{Q}}, \{\pm 1\}) = \mathbb{Q}^\times/\mathbb{Q}^{\times 2} \cong \mathrm{Hom}(G_{\mathbb{Q}}, \mu_2).$$

The right hand side here is the (infinite) set of all quadratic characters of $G_{\mathbb{Q}}$.
Fixing a finite set $\Sigma$ of primes, we have
$$H^1(G_\Sigma, \{\pm 1\}) \cong \left\{ \begin{array}{l} \text{the finite set of all quadratic characters of } G_{\mathbb{Q}} \\ \text{unramified outside } \Sigma \end{array} \right\}.$$
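The finite set in the last formula can be listed explicitly through the Kummer isomorphism: a class $d \in \mathbb{Q}^\times/\mathbb{Q}^{\times 2}$, represented by a squarefree integer, corresponds to the character cutting out $\mathbb{Q}(\sqrt{d})$. The sketch below enumerates those $d$ for which $\mathbb{Q}(\sqrt{d})$ is unramified at every finite prime outside the given set. It is an illustration under two assumptions not spelled out in the text: ramification at the infinite place is allowed, and ramification is read off from the field discriminant ($d$ if $d \equiv 1 \bmod 4$, and $4d$ otherwise). The function names are hypothetical.

```python
from itertools import combinations


def prime_factors(n):
    """Set of primes dividing the non-zero integer n."""
    n = abs(n)
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def quadratic_classes_unramified_outside(primes):
    """Squarefree d with Q(sqrt d) unramified at all finite primes not in `primes`.

    d = 1 stands for the trivial character.
    """
    primes = set(primes)
    result = []
    for size in range(len(primes) + 1):
        for subset in combinations(sorted(primes), size):
            product = 1
            for q in subset:
                product *= q
            for d in (product, -product):
                # Python's % returns a value in {0, 1, 2, 3} also for negative d.
                discriminant = d if d % 4 == 1 else 4 * d
                if prime_factors(discriminant) <= primes:
                    result.append(d)
    return sorted(result)


if __name__ == "__main__":
    print(quadratic_classes_unramified_outside({3, 5}))
    print(quadratic_classes_unramified_outside({2, 3}))
```

With these conventions the set has $2^{|\Sigma|}$ elements when $2 \notin \Sigma$ and $2^{|\Sigma|+1}$ elements when $2 \in \Sigma$, so it is finite as claimed.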


Example 7.53 (The inflation-restriction sequence and Local Tate duality). Let
$H$ be an open normal subgroup in $G$ and let $A$ be a $G$-module. Then one has
the following "inflation-restriction" exact sequence (comp. with (4.5.24)):
$$0 \to H^1(G/H, A^H) \xrightarrow{\ \mathrm{Inf}\ } H^1(G, A) \xrightarrow{\ \mathrm{Res}\ } H^1(H, A)^{G/H} \to H^2(G/H, A^H). \tag{7.5.7}$$

Theorem 7.54 (Local Tate duality). Let $X$ be a finite $D_l$-module, where
$D_l = \mathrm{Gal}(\overline{\mathbb{Q}}_l/\mathbb{Q}_l) \subset G_{\mathbb{Q}}$, and let $n = |X|$. We shall also consider the dual
module
$$X^* = \mathrm{Hom}(X, \mu_n) \qquad \big((g x^*)(x) = g\, x^*(g^{-1} x)\big).$$
Then:
a) The groups $H^i(D_l, X)$ are finite for all $i \ge 0$, and are trivial for $i \ge 3$;

b) There is a non-degenerate pairing
$$H^i(D_l, X) \times H^{2-i}(D_l, X^*) \to H^2(D_l, \overline{\mathbb{Q}}_l^\times) \cong \mathbb{Q}/\mathbb{Z};$$

c) If $l \nmid |X|$ then the subgroups of unramified classes
$$H^1(D_l/I_l, X^{I_l}) \quad \text{and} \quad H^1(D_l/I_l, (X^*)^{I_l})$$
are each other's annihilators with respect to the above pairing;

d) The Euler characteristic of $X$ is given by the following explicit formula:
$$\frac{\#H^1(D_l, X)}{\#H^0(D_l, X) \cdot \#H^2(D_l, X)} = l^{v_l(\#X)} = \frac{\#H^1(D_l, X^*)}{\#H^0(D_l, X^*) \cdot \#H^2(D_l, X^*)}.$$


Generalized Selmer groups

Definition 7.55. Let $X$ be a finite $G_{\mathbb{Q}}$-module and suppose we have a family
$\mathcal{L} = \{L_l\}$ of subgroups $L_l \subset H^1(D_l, X)$, which are finite by Theorem 7.54.
We shall assume that for $l \notin \Sigma_S$ we have
$$L_l = \mathrm{Ker}(H^1(D_l, X) \to H^1(I_l, X)). \tag{7.5.8}$$
The Selmer group attached to a family $\mathcal{L} = \{L_l\}$ is defined by
$$\mathrm{Sel}_{\mathcal{L}}(X) = \{ x \in H^1(G_{\mathbb{Q}}, X) \mid \forall l,\ \mathrm{Res}_{D_l}(x) \in L_l \}. \tag{7.5.9}$$

Remark 7.56. It follows from (7.5.8) and (7.5.3) that $\mathrm{Sel}_{\mathcal{L}}(X) \subset \mathrm{Sel}_\Sigma(X)$.


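Interpretation of $\Phi_{R_\Sigma}$ as a generalized Selmer group

Consider the finite $G_{\mathbb{Q}}$-module $X_{\lambda^n} = \mathrm{Ad}^0\, \tilde{\rho} \bmod \lambda^n$, where
$$\tilde{\rho} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
denotes Ribet's modular Galois representation of minimal level (see Theorem
7.30). Let us fix a (ramification) type $\mathcal{D}_\Sigma$ and consider the tangent space
$\Phi_{R_\Sigma} = I_{R_\Sigma}/I_{R_\Sigma}^2$. This group was interpreted by Wiles using the generalized
Selmer group attached to the following family
$$
L_l = \begin{cases}
\mathrm{Ker}(H^1(D_l, X_{\lambda^n}) \to H^1(I_l, X_{\lambda^n})), & \text{if } l \notin \Sigma_S, \\
H^1(D_l, X_{\lambda^n}), & \text{if } l = p, \\
\mathrm{Ker}(H^1(D_l, X_{\lambda^n}) \to H^1(I_l, X_{\lambda^n}/X^0)), \quad X^0 = \left\{ \begin{pmatrix} 0 & * \\ 0 & 0 \end{pmatrix} \right\}, & \text{if } l \in \Sigma \cup S.
\end{cases}
\tag{7.5.10}
$$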


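Definition 7.57. With the choice of $X = X_{\lambda^n}$ as above, and $\mathcal{L}_\Sigma = \{L_l\}$
given by (7.5.10), we define


$$\mathrm{Sel}_{\mathcal{D}_\Sigma}(X) = \mathrm{Sel}_{\mathcal{L}_\Sigma}(X) \ \ (\text{or } H^1_{\mathcal{D}_\Sigma}(G_{\mathbb{Q}}, X)) = \{x \in H^1(G_{\mathbb{Q}}, X) \mid \forall l,\ \mathrm{Res}^{G_{\mathbb{Q}}}_{D_l}(x) \in L_l\}. \qquad (7.5.11)$$
Theorem 7.58. There is an isomorphism of finite $\mathcal{O}$-modules:
$$\mathrm{Hom}_{\mathcal{O}\text{-mod}}(I_{R_\Sigma}/I_{R_\Sigma}^2, \mathcal{O}/\lambda^n) \cong H^1_{\mathcal{D}_\Sigma}(G_{\mathbb{Q}}, X_{\lambda^n}).$$
Lemma 7.59. There is an isomorphism of finite $\mathcal{O}$-modules:


$$\mathrm{Hom}_k(\mathfrak{m}_{R_\Sigma}/(\lambda, \mathfrak{m}_{R_\Sigma}^2), k) \cong H^1_{\mathcal{D}_\Sigma}(G_{\mathbb{Q}}, \mathrm{Ad}^0\rho_0).$$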


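Proof of Lemma 7.59. By Theorem 7.50, the restriction map
$$\mathrm{Hom}_{\mathcal{O}\text{-alg}}(R_\Sigma, k[\varepsilon]) \ni \varphi \mapsto \varphi|_{\mathfrak{m}_{R_\Sigma}} : \mathfrak{m}_{R_\Sigma}/(\lambda, \mathfrak{m}_{R_\Sigma}^2) \to k$$
gives a canonical isomorphism
$$\mathrm{Hom}_{\mathcal{O}\text{-alg}}(R_\Sigma, k[\varepsilon]) \cong \mathrm{Hom}_k(\mathfrak{m}_{R_\Sigma}/(\lambda, \mathfrak{m}_{R_\Sigma}^2), k).$$


Indeed, any such $\varphi$ vanishes on $(\lambda, \mathfrak{m}_{R_\Sigma}^2)$ and is determined by its restriction
to $\mathfrak{m}_{R_\Sigma}$ by the formula


$$R_\Sigma \ni a \mapsto \iota(a) + \varepsilon\,\varphi(a - \iota(a)) \in k + \varepsilon k.$$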


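In the above,
$$\iota : R_\Sigma \to R_\Sigma/\mathfrak{m}_{R_\Sigma} \to k \subset R_\Sigma$$


is the canonical projection and $\varphi$ is the identity on $\iota(R_\Sigma)$. By the universal
property of $R_\Sigma$, the map $\varphi$ may be identified with an infinitesimal deformation


$$\rho' : G_{\mathbb{Q}} \to \mathrm{GL}_2(k[\varepsilon])$$


of $\rho_0$ of type $\mathcal{D}_\Sigma$, i.e. with an element of $H^1_{\mathcal{D}_\Sigma}(G_{\mathbb{Q}}, \mathrm{Ad}^0\rho_0)$. This implies the
lemma.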


To deduce Theorem 7.58 from Lemma 7.59, one can replace $\mathfrak{m}_{R_\Sigma}$ by
$(I_{R_\Sigma}, \lambda^n)$, where $I_{R_\Sigma} = \mathrm{Ker}(\pi_{R_\Sigma} : R_\Sigma \to \mathcal{O})$, using a version of Nakayama’s
lemma (we omit the details).

Now we explain a formula relating $\#\Phi_{R_\Sigma}$ and $\#\Phi_{R_{\Sigma'}}$. An explicit formula
for $\#\Phi_{R_\Sigma}$ can be obtained from the cohomological exact sequence of Poitou-Tate.

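However, a weaker result suffices for Wiles’ induction, namely:


$$\#\Phi_{R_{\Sigma'}} \le \#\Phi_{R_\Sigma}\cdot\#H^1(I_l, X)^{D_l}, \quad\text{where } \Sigma' = \Sigma \cup \{l\}, \qquad (7.5.12)$$
which follows from
$$0 \to \mathrm{Sel}_{\mathcal{D}_\Sigma} \to \mathrm{Sel}_{\mathcal{D}_{\Sigma'}} \to H^1(I_l, X)^{D_l},$$


and may be deduced from the following form of the inflation-restriction
sequence (7.5.7) (we use the fact that $X$ is unramified at $l$, i.e. $I_l$ acts trivially
on $X$):


$$0 \to H^1(D_l/I_l, X) \to H^1(D_l, X) \to H^1(I_l, X)^{D_l} \to H^2(D_l/I_l, X) \to H^2(D_l, X). \qquad (7.5.13)$$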
Induction step: reformulation
Let us use the induction hypothesis and assume that
$$\varphi_\Sigma : R_\Sigma \xrightarrow{\ \sim\ } T_\Sigma,$$

which implies the equality of the corresponding invariants (7.5.1)

$$\#\Phi_{R_\Sigma} = \#\mathcal{O}/\eta_{T_\Sigma} < \infty.$$
According to Criterion I, it suffices to prove the inequality (7.5.2)

$$\#\Phi_{R_{\Sigma'}} \le \#\mathcal{O}/\eta_{T_{\Sigma'}} < \infty.$$


The left hand side is controlled by a Galois cohomology group: we know by
(7.5.12) that


$$\#\Phi_{R_{\Sigma'}} \le \#\Phi_{R_\Sigma}\cdot\#H^1(I_l, X)^{D_l}, \quad\text{where } \Sigma' = \Sigma \cup \{l\}.$$


The right hand side will be computed below using a relative invariant $\eta_{\Sigma',\Sigma}$
which relates $\#\mathcal{O}/\eta_{T_{\Sigma'}}$ and $\#\mathcal{O}/\eta_{T_\Sigma}$:


$$\#\mathcal{O}/\eta_{T_{\Sigma'}} = \#\mathcal{O}/\eta_{T_\Sigma}\cdot\#\mathcal{O}/\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}).$$


We explain next the main inequality between these quantities implying
the induction step.


7.6 The Relative Invariant, the Main Inequality and The 
Minimal Case 


7.6.1 The Relative invariant 
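Recall that we use the following invariants (7.4.10) of a local $\mathcal{O}$-algebra $A$:
$$I_A = \mathrm{Ker}\,\pi_A, \qquad \Phi_A = I_A/I_A^2, \qquad \eta_A = \pi_A(\mathrm{Ann}\,I_A) \subset \mathcal{O}$$


(the “kernel of augmentation”, resp. the “tangent space”, resp. the “congruence
module”).
Under the induction hypothesis there is an isomorphism of complete
intersection $\mathcal{O}$-algebras:
$$\varphi_\Sigma : R_\Sigma \xrightarrow{\ \sim\ } T_\Sigma,$$


and this implies the following identity:
$$\#\Phi_{R_\Sigma} = \#\mathcal{O}/\eta_{T_\Sigma} < \infty.$$

According to Criterion I, it suffices to prove the following inequality (7.5.2):
$$\#\Phi_{R_{\Sigma'}} \le \#\mathcal{O}/\eta_{T_{\Sigma'}} < \infty,$$


where $\Sigma' = \Sigma \cup \{l\}$, given (7.5.1). The left hand side is controlled by a Galois
cohomology group: we know by (7.5.12) that


$$\#\Phi_{R_{\Sigma'}} \le \#\Phi_{R_\Sigma}\cdot\#H^1(I_l, X)^{D_l}, \quad\text{where } \Sigma' = \Sigma \cup \{l\}.$$


The right hand side will be computed below using a certain relative invariant
$\eta_{\Sigma',\Sigma}$ which relates $\#\mathcal{O}/\eta_{T_{\Sigma'}}$ and $\#\mathcal{O}/\eta_{T_\Sigma}$:


$$\#\mathcal{O}/\eta_{T_{\Sigma'}} = \#\mathcal{O}/\eta_{T_\Sigma}\cdot\#\mathcal{O}/\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}).$$
We have by definition
$$\eta_{T_\Sigma} = \pi_{T_\Sigma}(\mathrm{Ann}\,I_{T_\Sigma}) \subset \mathcal{O}, \qquad \eta_{T_{\Sigma'}} = \pi_{T_{\Sigma'}}(\mathrm{Ann}\,I_{T_{\Sigma'}}) \subset \mathcal{O}. \qquad (7.6.1)$$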


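Notice that by the universal property of $T_\Sigma$, we have a commutative diagram:


$$\begin{array}{ccc} T_{\Sigma'} & \xrightarrow{\ \pi_{\Sigma',\Sigma}\ } & T_\Sigma \\ {\scriptstyle \pi_{T_{\Sigma'}}}\searrow & & \swarrow{\scriptstyle \pi_{T_\Sigma}} \\ & \mathcal{O} & \end{array} \qquad (7.6.2)$$
From this we obtain
$$\eta_{T_{\Sigma'}} = \eta_{T_\Sigma}\cdot\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}), \qquad (7.6.3)$$


where
$$\eta_{\Sigma',\Sigma} = \pi_{\Sigma',\Sigma}(\mathrm{Ann}\,I_{T_{\Sigma'}})$$


is called the relative invariant of the morphism $\pi_{\Sigma',\Sigma}$.
An explicit computation of the invariant using the construction of the
universal modular deformation ring $T_\Sigma$ in §7.4.4 leads to


$$\eta_{\Sigma',\Sigma} = \pi_{\Sigma',\Sigma}(\mathrm{Ann}\,I_{T_{\Sigma'}}) = \bigl((l-1)(T_l^2 - (l+1)^2)\bigr) \in T_\Sigma. \qquad (7.6.4)$$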


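This calculation involves computations with the reduction of modular curves
(see [Da95]). We omit the details.
Below we shall explain the main inequality between these quantities:


$$\#H^1(I_l, X)^{D_l} \le \#\mathcal{O}/\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}), \quad\text{where } \Sigma' = \Sigma \cup \{l\}. \qquad (7.6.5)$$


This inequality implies the induction step. The use of the relative invariant
$\#\mathcal{O}/\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma})$ in (7.6.5) requires the choice of augmentations $\pi_{T_\Sigma}$ and $\pi_{T_{\Sigma'}}$
given above by (7.3.4):
$$\pi_{T_\Sigma} : \mathrm{tr}\,\rho_\Sigma^{\mathrm{univ.mod}}(\mathrm{Frob}_l) \mapsto a_l, \qquad \pi_{T_{\Sigma'}} : \mathrm{tr}\,\rho_{\Sigma'}^{\mathrm{univ.mod}}(\mathrm{Frob}_l) \mapsto a_l,$$
where $a_l \in \mathcal{O}$ are the Fourier coefficients of Ribet’s cusp eigenform
$$f(z) = \sum_{n=1}^{\infty} a_n e(nz) \in S_2(N_0)$$
of minimal level (whose Galois representation $\rho_{f,\lambda}$ corresponds to a deformation
$\rho \in \mathcal{DM}_{\emptyset}(\mathcal{O})$).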


7.6.2 The Main Inequality 


Applying the augmentation $\pi_{T_\Sigma}$ given above by (7.3.4) to the relative
invariant $\eta_{\Sigma',\Sigma}$ introduced above, we obtain:


$$\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}) = \pi_{T_\Sigma}\bigl((l-1)(T_l^2 - (l+1)^2)\bigr) = (l-1)(a_l^2 - (l+1)^2) \in \mathcal{O}. \qquad (7.6.6)$$


Notice that the quantity (7.6.6) does not vanish due to Deligne’s bound $|a_l| \le 2\sqrt{l}$:
$$\pi_{T_\Sigma}(\eta_{\Sigma',\Sigma}) = (l-1)(a_l^2 - (l+1)^2) \ne 0.$$
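The non-vanishing can be checked numerically in a simple special case. The following is a minimal sketch, assuming integer values of $a_l$ (as for an elliptic curve over $\mathbb{Q}$); the helper names are illustrative and do not come from the text. It enumerates all integers $a$ with $a^2 \le 4l$ and confirms that $(l-1)(a^2-(l+1)^2)$ is never zero, which reflects the inequality $4l < (l+1)^2$.

```python
from math import isqrt

def primes_up_to(n):
    """Return the list of primes <= n using a simple sieve (n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def relative_invariant(l, a):
    """The integer (l - 1)(a^2 - (l + 1)^2), with a standing for a_l."""
    return (l - 1) * (a * a - (l + 1) ** 2)

def nonvanishing_for_prime(l):
    """Check every integer a with a^2 <= 4l, i.e. |a| <= 2*sqrt(l)."""
    bound = isqrt(4 * l)
    return all(relative_invariant(l, a) != 0 for a in range(-bound, bound + 1))

if __name__ == "__main__":
    print(all(nonvanishing_for_prime(l) for l in primes_up_to(1000)))
```

In general $a_l$ lies in $\mathcal{O}$ rather than in $\mathbb{Z}$, so this only illustrates the shape of the argument.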


We shall deduce the main inequality (7.6.5) in the following more explicit 
form: 
Theorem 7.60 (Main Inequality). Consider for any $n \ge 1$ the finite $G_{\mathbb{Q}}$-module
$X = \mathrm{Ad}^0\rho \bmod \lambda^n$, where


$$\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathcal{O})$$


denotes Ribet’s modular Galois representation of minimal level (see Theorem
7.30). Then the following inequality holds:


$$\#H^1(I_l, X)^{D_l} \le \#\mathcal{O}/(l-1)(a_l^2 - (l+1)^2), \quad\text{where } \Sigma' = \Sigma \cup \{l\}. \qquad (7.6.7)$$
Proof. The proof follows from the computation of a determinant. We take into account that
$$H^1(I_l, X)^{D_l} = \mathrm{Hom}_{D_l}(I_l, X). \qquad (7.6.8)$$


Since the action of $I_l$ on $X = \mathrm{Ad}^0\rho \bmod \lambda^n$ is trivial for $l \nmid N(\rho)$, it follows
that


$$H^1(I_l, X)^{D_l} = \mathrm{Hom}_{D_l}(I_l, X). \qquad (7.6.9)$$


Notice that $I_l = \mathrm{Gal}(\overline{\mathbb{Q}}_l/\mathbb{Q}_l^{nr})$, where $\mathbb{Q}_l^{nr} \supset \mu_{p^\infty}$ is the maximal unramified
extension of $\mathbb{Q}_l$ (which contains all $p$-power roots of unity because $p \ne l$).


This means that there exists a canonical surjection


$$I_l \to \mathbb{Z}_p(1) = \varprojlim_n \mu_{p^n}$$


(the right hand side here is the Galois group of the maximal Kummer
$p$-extension of $\mathbb{Q}_l^{nr}$).

Next, notice that the order of the finite module $X = \mathrm{Ad}^0\rho \bmod \lambda^n$ is a
power of $p$. Hence any homomorphism in (7.6.9) factorizes through $\mathbb{Z}_p(1)$. We
therefore have


$$\mathrm{Hom}_{D_l}(I_l, X) = \mathrm{Hom}_{D_l}(\mathbb{Z}_p(1), X) = X(-1)^{D_l} \qquad (7.6.10)$$


where


$$X(-1) = X \otimes \mathbb{Z}_p(-1), \qquad \mathbb{Z}_p(-1) = \mathrm{Hom}(\mathbb{Z}_p(1), \mathbb{Z}_p).$$
Since the action of $I_l$ on $X = \mathrm{Ad}^0\rho \bmod \lambda^n$ is trivial, we now see that
$$X(-1)^{D_l} = X(-1)^{D_l/I_l} = X(-1)^{\mathrm{Frob}_l = 1} = \mathrm{Ker}(\mathrm{Frob}_l - 1)|_{X(-1)}. \qquad (7.6.11)$$
We may calculate further:


$$\#\mathrm{Ker}(\mathrm{Frob}_l - 1)|_{X(-1)} = \#\mathrm{Coker}(\mathrm{Frob}_l - 1)|_{X(-1)} = \#\mathcal{O}/\det(\mathrm{Frob}_l - 1)|_{X(-1)} \qquad (7.6.12)$$


and one only needs to verify that
$$\#\mathcal{O}/\det(\mathrm{Frob}_l - 1)|_{X(-1)} = \#\mathcal{O}/(l-1)(a_l^2 - (l+1)^2).$$


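To prove this we shall calculate explicitly the eigenvalues of $\mathrm{Frob}_l - 1$ on
$X(-1)$. Let $\alpha = \alpha_l$ and $\beta = \beta_l$ be the eigenvalues of $\mathrm{Frob}_l$ acting on the
$\mathcal{O}$-module $M = \mathcal{O}^2$. Thus we have:


$$\alpha + \beta = a_l, \qquad \alpha\beta = l.$$


We shall now determine the eigenvalues of $\mathrm{Frob}_l$ acting on the following
$\mathcal{O}$-modules:


- on $\check{M} = \mathrm{Hom}(M, \mathbb{Q}/\mathbb{Z})$ the eigenvalues of $\mathrm{Frob}_l$ are $\{\alpha^{-1}, \beta^{-1}\}$;
- on $\mathrm{End}\,M \cong M \otimes \check{M}$ the eigenvalues of $\mathrm{Frob}_l$ are $\{1, 1, \alpha\beta^{-1}, \beta\alpha^{-1}\}$. To
see this, note that, up to reordering the basis,


$$\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix} \otimes \begin{pmatrix} \alpha^{-1} & 0 \\ 0 & \beta^{-1} \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \alpha\beta^{-1} & 0 \\ 0 & 0 & 0 & \beta\alpha^{-1} \end{pmatrix};$$


- on $\mathrm{Ad}^0\rho \subset \mathrm{End}\,M$ there are only three eigenvalues of $\mathrm{Frob}_l$:


$$\left\{1, \frac{\alpha}{\beta}, \frac{\beta}{\alpha}\right\};$$


- on $X(-1)$ there are the following eigenvalues of $\mathrm{Frob}_l$:


$$\left\{\frac{1}{l}, \frac{\alpha^2}{l^2}, \frac{\beta^2}{l^2}\right\}.$$


This is because $X(-1) = \mathbb{Z}_p(-1) \otimes X$, and $\mathrm{Frob}_l$ acts on $\mathbb{Z}_p(-1)$ as $\frac{1}{l}$
(note that $\alpha/\beta = \alpha^2/l$ and $\beta/\alpha = \beta^2/l$, since $\alpha\beta = l$);
- finally the eigenvalues of $\mathrm{Frob}_l - 1$ on $X(-1)$ are the following:


$$\left\{\frac{1}{l} - 1, \frac{\alpha^2}{l^2} - 1, \frac{\beta^2}{l^2} - 1\right\}.$$


Hence
$$\det(\mathrm{Frob}_l - 1)|_{X(-1)} = \left(\frac{1}{l} - 1\right)\left(\frac{\alpha^2}{l^2} - 1\right)\left(\frac{\beta^2}{l^2} - 1\right) = -\frac{(l-1)(l^2 - \alpha^2)(l^2 - \beta^2)}{l^5}.$$


It follows that:


$$\det(\mathrm{Frob}_l - 1)|_{X(-1)} = -\frac{(l-1)(l^4 - l^2(\alpha^2 + \beta^2) + \alpha^2\beta^2)}{l^5} = -\frac{(l-1)(l^2 + 1 - (\alpha^2 + \beta^2))}{l^3} = \frac{(l-1)(a_l^2 - (l+1)^2)}{l^3},$$


where we used $\alpha^2\beta^2 = l^2$ and $\alpha^2 + \beta^2 = a_l^2 - 2l$.
This proves the Main inequality (7.6.7), because $l \in \mathcal{O}^\times$ is a unit.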
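The determinant identity can be sanity-checked numerically. The sketch below is an illustration only (the function names are ours, and $a_l$ is taken to be an ordinary integer): it computes $\alpha, \beta$ as the complex roots of $x^2 - a_l x + l$, forms the product of the three eigenvalues $\frac{1}{l} - 1$, $\frac{\alpha^2}{l^2} - 1$, $\frac{\beta^2}{l^2} - 1$, and compares it with $(l-1)(a_l^2 - (l+1)^2)/l^3$.

```python
import cmath

def frobenius_det(l, a):
    """Product of the eigenvalues of Frob_l - 1 on X(-1), computed numerically."""
    disc = cmath.sqrt(a * a - 4 * l)
    alpha, beta = (a + disc) / 2, (a - disc) / 2  # roots of x^2 - a x + l
    eigenvalues = [1 / l, alpha ** 2 / l ** 2, beta ** 2 / l ** 2]
    det = 1
    for ev in eigenvalues:
        det *= ev - 1
    return det

def closed_form(l, a):
    """The closed form (l - 1)(a^2 - (l + 1)^2) / l^3."""
    return (l - 1) * (a * a - (l + 1) ** 2) / l ** 3

if __name__ == "__main__":
    for l, a in [(5, 2), (7, -3), (11, 0), (13, 6)]:
        print(l, a, abs(frobenius_det(l, a) - closed_form(l, a)) < 1e-12)
```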
7.6.3 The Minimal Case


We keep the notations and assumptions of Main theorem 7.33. To complete
the proof of Main theorem 7.33, we now treat the minimal case $\Sigma = \emptyset$, and
establish the following


Theorem 7.61. The surjective morphism
$$\varphi_\emptyset : R_\emptyset \to T_\emptyset$$


in the category $\mathcal{C}_{\mathcal{O}}$ is an isomorphism of complete intersection $\mathcal{O}$-algebras.


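Proof of Theorem 7.61. We notice first that the $\mathcal{O}$-algebra $R_\emptyset$ admits $n$
(topological) generators, where


$$n = \dim_k \mathfrak{m}_{R_\emptyset}/(\lambda, \mathfrak{m}_{R_\emptyset}^2) = \dim_k \mathrm{Sel}_{\mathcal{D}_\emptyset}(X).$$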


Here $X = M_2(k)^{\mathrm{tr}=0}$ is a finite $G_{\mathbb{Q}}$-module with the action given by the
representation $\rho_0$.

We shall construct a $J$-structure on the surjective morphism $\varphi_\emptyset$. By
Criterion II of §7.4.6, this will show that $R_\emptyset$ and $T_\emptyset$ are isomorphic complete
intersection rings.

The construction of the J—structure uses Nakayama’s lemma and the 
Chebotarev density theorem (see Theorem 4.22). 

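For any $m \in \mathbb{N}$, we choose a finite set of primes:


$$Q_m = \{q_{m1}, \dots, q_{mn}\} \qquad (7.6.13)$$


with the properties


- $q_{mj} \equiv 1 \bmod p^m$ ($\Longrightarrow$ $p^m$ divides $\#(\mathbb{Z}/q_{mj}\mathbb{Z})^\times$);
- the eigenvalues of $\rho_0(\mathrm{Frob}_{q_{mj}})$ in $k$ are distinct.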
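The first of the two conditions is easy to explore by machine. The following minimal sketch (with hypothetical helper names) lists the first few primes $q \equiv 1 \bmod p^m$; it does not test the second condition on the eigenvalues of Frobenius, which depends on the representation $\rho_0$ and is not modelled here.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def auxiliary_primes(p, m, count):
    """Return the first `count` primes q with q = 1 mod p^m."""
    modulus = p ** m
    found = []
    q = modulus + 1
    while len(found) < count:
        if is_prime(q):
            found.append(q)
        q += modulus
    return found

if __name__ == "__main__":
    for q in auxiliary_primes(3, 2, 3):
        # p^m divides the order q - 1 of the cyclic group (Z/qZ)^*
        print(q, (q - 1) % 3 ** 2 == 0)
```

Since $(\mathbb{Z}/q\mathbb{Z})^\times$ is cyclic of order $q - 1$, each such $q$ gives a unique cyclic subgroup (and quotient) of order $p^m$.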


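We begin defining the $J$-structure using these finite sets of primes $\Sigma = Q_m$
as follows:


$$A_m = R_{Q_m}, \quad B_m = T_{Q_m}, \quad A = R_\emptyset, \quad B = T_\emptyset.$$
We have:


$$\mathcal{O}[[S_1, \dots, S_n]]/J_m = \mathcal{O}[[S_1, \dots, S_n]]/(\omega_m(S_1), \dots, \omega_m(S_n)) = \mathcal{O}[\Delta_1 \times \dots \times \Delta_n], \qquad (7.6.14)$$


where $\Delta_j$ denotes the cyclic subgroup of order $p^m$ in $(\mathbb{Z}/q_{mj}\mathbb{Z})^\times$.
We shall consider deformations $\rho$ of type $\mathcal{D}_m = \mathcal{D}_{Q_m}$.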
Proposition 7.62. Under the notations and assumptions of the Main Theorem
7.33, there is an isomorphism


$$\mathrm{Sel}_{\emptyset} \cong \mathrm{Sel}_{Q_m};$$
the $\mathcal{O}$-algebra $A_m = R_{Q_m}$ admits $n$ generators.


The proof of Proposition 7.62 uses an explicit formula for the order of the finite
group $\mathrm{Sel}_{\mathcal{D}_m}$ obtained from the cohomological exact sequence of Poitou-Tate.
We omit the details here, but see §7.5.5.

In order to have a $J$-structure notice that for any $q = q_{mj} \in Q_m$ the action
of the inertia group $I_q$ on $X$ factorizes through the quotient $\Delta_q$ of $(\mathbb{Z}/q\mathbb{Z})^\times$
of order $p^m$, by a result of Faltings:


$$I_q \to \mathrm{Gal}(\mathbb{Q}(\mu_q)/\mathbb{Q}) \cong (\mathbb{Z}/q\mathbb{Z})^\times \to \Delta_q$$
(see Appendix by G. Faltings to [Wi]). Next, notice that
$$\mathbb{Z}_p[\Delta_q] \cong \mathbb{Z}_p[[S]]/\omega_m(S), \qquad \omega_m(S) = (1 + S)^{p^m} - 1,$$
and there is an isomorphism
$$\mathcal{O}[\Delta_{q_{m1}} \times \dots \times \Delta_{q_{mn}}] \cong C_m = \mathcal{O}[[S_1, \dots, S_n]]/J_m, \qquad (7.6.15)$$


where $J_m = (\omega_m(S_1), \dots, \omega_m(S_n)) \subset \mathcal{O}[[S_1, \dots, S_n]]$. According to a result
of de Shalit on the structure of certain Hecke algebras, $B_m/J_m B_m$ is a torsion
free module of finite rank over $C_m$ (see [CSS95]; the proof of this result
uses diamond operators in the Hecke algebras $T(N_\Sigma)$).
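As a side remark not spelled out in the text, the polynomial $\omega_m(S) = (1+S)^{p^m} - 1$ is monic of degree $p^m$ with all lower coefficients divisible by $p$, which is consistent with $\mathbb{Z}_p[[S]]/\omega_m(S)$ being free of rank $p^m = \#\Delta_q$ over $\mathbb{Z}_p$. The sketch below (function names are ours) checks this divisibility for small $p$ and $m$.

```python
from math import comb

def omega_coefficients(p, m):
    """Coefficients c_0, ..., c_{p^m} of omega_m(S) = (1 + S)^(p^m) - 1."""
    n = p ** m
    coeffs = [comb(n, k) for k in range(n + 1)]
    coeffs[0] -= 1  # subtract the constant 1
    return coeffs

def is_distinguished(p, m):
    """True if omega_m is monic and all lower coefficients are divisible by p."""
    coeffs = omega_coefficients(p, m)
    return coeffs[-1] == 1 and all(c % p == 0 for c in coeffs[:-1])

if __name__ == "__main__":
    print([is_distinguished(p, m) for p in (2, 3, 5) for m in (1, 2, 3)])
```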

Consider the surjective morphism $\varphi : A \to B$ of Theorem 7.61. Then
the isomorphism shows that $\varphi$ admits a $J$-structure: there exists a family of
commutative diagrams, for all $m \in \mathbb{N}$:


[Commutative diagram over $\mathcal{O}[[S_1, \dots, S_n]]$ involving the maps $\xi_m$ and $\varphi_m$, the algebras $A_m$, $B_m$, and $\varphi : A \to B$]


with the following properties:
i) $\xi_m$ is surjective;
ii) $\varphi_m$ is surjective;
iii) $A_m/J_0 A_m \cong A$, $B_m/J_0 B_m \cong B$;
iv) $B_m/J_m B_m$ is a torsion free module of finite rank over the $\mathcal{O}$-algebra


$$\mathcal{O}[[S_1, \dots, S_n]]/J_m.$$


In order to conclude the proof of Theorem 7.61, we apply directly Criterion
II of §7.4.7: let $\varphi : A \to B$ be a surjective morphism in the category $\mathcal{C}_{\mathcal{O}}$,
which admits a $J$-structure. Then $\varphi$ is an isomorphism of two local complete
intersection $\mathcal{O}$-algebras (see also [Ta-Wi]).


7.7 End of Wiles’ Proof and Theorem on Absolute 
Irreducibility 


7.7.1 Theorem on Absolute Irreducibility 


In this section we explain Wiles’ deduction of the Shimura-Taniyama-Weil
conjecture (Theorem 7.13) from the theorem on modularity of admissible
deformations 7.33.

In order to use Theorem 7.33 one needs to have an absolutely irreducible
Galois representation $\rho_0$ over a finite field $k$, starting from an elliptic curve.


Theorem 7.63 (Irreducibility). Let
$$E : y^2 = 4x^3 - g_2 x - g_3$$


be the Weierstrass form of a semistable elliptic curve over $\mathbb{Q}$. Suppose that the
Galois representations $\overline{\rho}_{3,E}$ and $\overline{\rho}_{5,E}$ are both reducible.
Then there are only four possibilities for $j_E$:


$$j_E \in \left\{ -\frac{25}{2},\ -\frac{5^2 \cdot 241^3}{2^3},\ -\frac{5 \cdot 29^3}{2^5},\ \frac{5 \cdot 211^3}{2^{15}} \right\},$$


and $E$ is modular.


Proof of Theorem 7.63. The modularity of $E$ in the exceptional cases above
is checked directly, using for example the Cremona tables [Cre97] (see also
[Rub95], Lemma 9).

We shall use the modular parameterization of the set of equivalence classes of
elliptic curves (see §6.3.2, (6.3.13)):


$$\Gamma_0(N)\backslash\mathbb{H} \xrightarrow{\ \sim\ } \left\{ (E, C_N) \ \middle|\ \begin{array}{l} E \text{ an elliptic curve over } \mathbb{C} \\ \text{together with a cyclic subgroup } C_N \text{ of order } N \end{array} \right\} \Big/ (\text{isomorphism}).$$


This set can be identified with the set of $\mathbb{C}$-points of an affine algebraic curve
$Y_0(N)$. The $\mathbb{C}$-points of its Zariski closure $X_0(N)$ (called a modular curve) can
be identified with the compact quotient $\Gamma_0(N)\backslash\overline{\mathbb{H}}$. Both curves are defined over
$\mathbb{Q}$. The boundary


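$$X_0(N)(\mathbb{C}) \setminus Y_0(N)(\mathbb{C}) \xrightarrow{\ \sim\ } \Gamma_0(N)\backslash(\mathbb{Q} \cup \infty)$$


is the set of cusps ($\Gamma_0(N)$-equivalence classes of parabolic points, defined over
$\mathbb{Q}$ by a theorem of Manin-Drinfeld). Under this identification, rational points
$Y_0(N)(\mathbb{Q})$ correspond to the set


$$\left\{ (E, C_N) \ \middle|\ \begin{array}{l} E \text{ an elliptic curve over } \mathbb{Q} \text{ together with a} \\ G_{\mathbb{Q}}\text{-invariant cyclic subgroup } C_N \text{ of order } N \end{array} \right\} \Big/ (\text{isom}).$$


Therefore:
$$X_0(N)(\mathbb{Q}) \xrightarrow{\ \sim\ } \{\text{cusps}\} \cup \left\{ (E, C_N) \ \middle|\ \begin{array}{l} E \text{ an elliptic curve over } \mathbb{Q} \text{ with a} \\ G_{\mathbb{Q}}\text{-invariant cyclic subgroup } C_N \text{ of order } N \end{array} \right\} \Big/ (\text{isom}).$$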


Example 7.64. The modular curve $X_0(15)$ has 4 cusps and it is in fact an
elliptic curve which can be defined by the equation $y^2 = x(x + 3^2)(x - 4^2)$,
with the following rational points:


$$X_0(15)(\mathbb{Q}) = \{(-9, 0),\ (0, 0),\ (16, 0),\ \infty,\ (-4, \pm 20),\ (36, \pm 180)\}.$$
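A direct substitution confirms which affine points lie on this model. The sketch below (names are illustrative) checks the seven affine points, with $x = 36$ in the last pair, against $y^2 = x(x + 9)(x - 16)$; together with the point at infinity they give eight rational points.

```python
def on_curve(x, y):
    """True if (x, y) satisfies y^2 = x (x + 9)(x - 16)."""
    return y * y == x * (x + 9) * (x - 16)

# Affine rational points of the model of X_0(15) used in Example 7.64.
AFFINE_POINTS = [(-9, 0), (0, 0), (16, 0), (-4, 20), (-4, -20), (36, 180), (36, -180)]

if __name__ == "__main__":
    for point in AFFINE_POINTS:
        print(point, on_curve(*point))
```

Note that $(-36, \pm 180)$ does not satisfy the equation, since $x(x+9)(x-16)$ is negative at $x = -36$.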


Now let $\overline{\rho}_{3,E}$ be a reducible representation; then there exists a $G_{\mathbb{Q}}$-invariant
cyclic subgroup $C_3 \subset E(\overline{\mathbb{Q}})$ of order 3; and if $\overline{\rho}_{5,E}$ is also a reducible
representation, then there exists a $G_{\mathbb{Q}}$-invariant cyclic subgroup $C_5 \subset E(\overline{\mathbb{Q}})$ of order 5.

We obtain therefore a $G_{\mathbb{Q}}$-invariant cyclic subgroup $C_{15} = C_3 + C_5 \subset E(\overline{\mathbb{Q}})$,
which gives one of the four points of type $(E, C_{15})$ in the set


$$X_0(15)(\mathbb{Q}) \setminus \{\text{cusps}\}$$


and an explicit evaluation of the invariants $j_E$ finishes the proof of Theorem
7.63.


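Proposition 7.65 ([Se95], Proposition 1). Let $E$ be a semistable elliptic
curve over $\mathbb{Q}$. Then either


- i) $\rho_0 = \overline{\rho}_{3,E}$ is surjective, or

- ii) the image $\rho_0(G_{\mathbb{Q}})$ is conjugate to a subgroup of $\left\{\begin{pmatrix} * & * \\ 0 & * \end{pmatrix}\right\}$, i.e. $\rho_0$ is reducible.
Proof. Let $\mathfrak{G}$ be the image of $\rho_0$ under the projection to $\mathrm{PGL}_2(\mathbb{F}_3) \cong \mathfrak{S}_4$.
Each $g \in G_{\mathbb{Q}}$ maps to a permutation $\sigma_g \in \mathfrak{S}_4$ of the set of 4 lines in $\mathbb{F}_3^2$.


Suppose that neither i) nor ii) holds. Then $\mathfrak{G} \ne \mathfrak{S}_4$, and there are no fixed
points. But $\rho_0$ is odd and


$$\det(\rho_0(g)) = \mathrm{sgn}(\sigma_g) \in \{\pm 1\} = \mathbb{F}_3^\times, \qquad \det : \mathrm{GL}_2(\mathbb{F}_3) \to \mathbb{F}_3^\times.$$


Hence $\mathfrak{G}$ is not contained in $\mathfrak{A}_4$. It follows that either $\mathfrak{G} \cong D_4 = \langle (1234), (14)(23) \rangle$
(the dihedral group), or $\mathfrak{G} \subset D_4$ is a subgroup of index 2 in $D_4$. In
both cases, the group $\mathfrak{G}^{ab} = \mathfrak{G}/[\mathfrak{G}, \mathfrak{G}]$ has order 4. However $\rho_0$ is semistable,
so for all $l \ne 3$ the group $\rho_0(I_l)$ is conjugate to a subgroup of $\left\{\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}\right\}$. Since the order of $\mathfrak{G}$ is not divisible
by 3, this implies that $\rho_0(I_l) = 1$. On the other hand, the Galois group of the
maximal abelian extension of $\mathbb{Q}$ unramified outside 3 is isomorphic to $\mathbb{Z}_3^\times$.
It has no factor groups of order 4, contradicting the fact that $|\mathfrak{G}^{ab}| = 4$.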
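The two group-theoretic facts used in this proof, that $\mathrm{PGL}_2(\mathbb{F}_3)$ acts on the four lines of $\mathbb{F}_3^2$ as the full symmetric group and that the determinant corresponds to the sign of the permutation, can be verified by brute force. The following sketch is illustrative; the encoding of matrices as 4-tuples and of lines by normalized vectors is our own choice.

```python
from itertools import product

P = 3
# The four lines in F_3^2, each represented by a normalized direction vector.
LINES = [(1, 0), (0, 1), (1, 1), (1, 2)]

def normalize(v):
    """Scale a nonzero vector so that its first nonzero coordinate is 1."""
    x, y = v
    lead = x if x != 0 else y
    inv = pow(lead, -1, P)
    return (x * inv % P, y * inv % P)

def det(g):
    """Determinant mod 3 of the matrix g = (a, b, c, d) = [[a, b], [c, d]]."""
    a, b, c, d = g
    return (a * d - b * c) % P

def permutation(g):
    """Permutation of the indices 0..3 of LINES induced by g."""
    a, b, c, d = g
    return tuple(
        LINES.index(normalize(((a * x + b * y) % P, (c * x + d * y) % P)))
        for x, y in LINES
    )

def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(
        1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

GL2 = [g for g in product(range(P), repeat=4) if det(g) != 0]

if __name__ == "__main__":
    print(len(GL2), len({permutation(g) for g in GL2}))
    print(all((det(g) == 1) == (sign(permutation(g)) == 1) for g in GL2))
```

Here the determinant value 2 stands for $-1 \in \mathbb{F}_3^\times$, so the last check says that $\det(g) = \mathrm{sgn}(\sigma_g)$.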


Note that a similar result holds for all $\overline{\rho}_{l,E}$, including $l = 5$.
Lemma 7.66 (J.-P. Serre, [Se95]). Let $E$ be a semistable elliptic curve
over $\mathbb{Q}$ such that $\overline{\rho}_{3,E}$ is irreducible; then the restriction


$$\rho_0|_{G_{\mathbb{Q}(\sqrt{-3})}}$$


is absolutely irreducible.


Proof of Lemma 7.66. We use the fact that $\overline{\rho}_{3,E}$ is odd, and one may assume
that $\mathrm{Im}(\overline{\rho}_{3,E}) \ni \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Suppose that $\rho_0|_{G_{\mathbb{Q}(\sqrt{-3})}}$ is not absolutely irreducible,
and that there exists a subspace $W \subset E[3] \otimes \overline{\mathbb{F}}_3$ invariant under $G_{\mathbb{Q}(\sqrt{-3})}$.
Then $W^\tau$ is also $G_{\mathbb{Q}(\sqrt{-3})}$-invariant, where $\tau$ is the image of the complex
conjugation, implying that


$$\rho_0|_{G_{\mathbb{Q}(\sqrt{-3})}} \subset \left\{\begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix}\right\} \subset \mathrm{GL}_2(\overline{\mathbb{F}}_3) \Longrightarrow \rho_0(G_{\mathbb{Q}(\sqrt{-3})}) \text{ is commutative}.$$
But under the assumption of Lemma 7.66, $\rho_0$ is surjective due to Proposition
7.65.

Therefore, the image of $G_{\mathbb{Q}(\sqrt{-3})}$ in $\mathrm{PGL}_2(\mathbb{F}_3) \cong \mathfrak{S}_4$ is not commutative (since it is
a subgroup of $\mathfrak{S}_4$ of index $\le 2$). This contradiction proves Lemma 7.66.

Theorem 7.63 becomes in this case the “Theorem on Absolute Irreducibility”.
In particular, the assumptions of Theorem 7.28 on modularity of admissible
deformations are satisfied for $\overline{\rho}_{3,E}$.


7.7.2 From p= 3 to p=5 


Let $E$ be a semistable elliptic curve over $\mathbb{Q}$. According to §7.7.1, only the
following three cases are possible:


(1) $\rho_0 = \overline{\rho}_{3,E}$ is irreducible;
(2) $\rho_0 = \overline{\rho}_{5,E}$ is irreducible;
(3) $j_E \in \left\{ -\frac{25}{2},\ -\frac{5^2 \cdot 241^3}{2^3},\ -\frac{5 \cdot 29^3}{2^5},\ \frac{5 \cdot 211^3}{2^{15}} \right\}$.


It was noticed in Theorem 7.63 that in the exceptional case (3) $E$ is modular.
As mentioned after Lemma 7.66, in Case (1) $\overline{\rho}_{3,E}$ satisfies the assumptions of
Theorem 7.28 on modularity of admissible deformations. Hence $\rho_{3,E}$, being
an admissible deformation of $\overline{\rho}_{3,E}$, is modular. This implies the modularity of
$E$. Case (2) is covered by the following


Theorem 7.67 ([Se95]). Let $E$ be a semistable elliptic curve over $\mathbb{Q}$ such
that $\overline{\rho}_{5,E}$ is irreducible. Then there exists an elliptic curve $E'$ over $\mathbb{Q}$ such
that $\overline{\rho}_{5,E'} \cong \overline{\rho}_{5,E}$ and $\overline{\rho}_{3,E'}$ is absolutely irreducible.


The proof of the theorem is carried out in §7.7.3.
Let us show how this implies Theorem 7.13: The same argument as in
Case (1) shows that the curve $E'$ is modular. Hence the representation $\overline{\rho}_{5,E'}$
is modular. However $\overline{\rho}_{5,E'}$ is equivalent to $\overline{\rho}_{5,E}$, hence $\overline{\rho}_{5,E}$ is also modular.

Moreover, let us check that $\overline{\rho}_{5,E}$ satisfies the assumptions of Theorem 7.28.
Knowing that $\overline{\rho}_{5,E}$ is irreducible, we show that the restriction


$$\overline{\rho}_{5,E}|_{G_{\mathbb{Q}(\sqrt{5})}}$$


is absolutely irreducible.
Indeed, the complex conjugation $\tau \in G_{\mathbb{Q}(\sqrt{5})}$ satisfies


$$\det\overline{\rho}_{5,E}(\tau) = -1, \qquad \overline{\rho}_{5,E}(\tau) \sim \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$


Hence both eigenspaces of $\overline{\rho}_{5,E}(\tau)$ are defined over $\mathbb{F}_5$, and the irreducibility
over $\mathbb{F}_5$ implies the absolute irreducibility (over $\overline{\mathbb{F}}_5$).

It follows that the assumptions of Theorem 7.28 on modularity of admissible
deformations are satisfied for the representation $\overline{\rho}_{5,E}$.

This implies that the representation $\rho_{5,E}$ is modular because it is an
admissible deformation of $\overline{\rho}_{5,E}$. This means that the curve $E$ is modular, proving
the STW Conjecture also in this case (Theorem 7.13).

It remains to explain the proof of Theorem 7.67.


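7.7.3 Families of elliptic curves with fixed $\overline{\rho}_{5,E}$


We explain a construction of $E' \in \{E_t\}$ as a fiber of an elliptic fibration
$E_t \to \mathbb{P}^1$ over $\mathbb{Q}$ with $\overline{\rho}_{5,E_t} \cong \overline{\rho}_{5,E}$ for all $t$ and such that $E = E_{t_0}$ for some
$t_0$.

We again use a modular curve, but this time it is attached to the congruence
subgroup $\Gamma(5)$ (see (6.3.15)). Let us use the modular parameterization
of the set of equivalence classes of elliptic curves (see §6.3.2, (6.3.13)):


$$\Gamma(N)\backslash\mathbb{H} \xrightarrow{\ \sim\ } \left\{ (E, \phi) \ \middle|\ \begin{array}{l} E \text{ an elliptic curve over } \mathbb{C} \text{ together with} \\ \text{an isomorphism } \phi : E[N] \xrightarrow{\sim} (\mathbb{Z}/N\mathbb{Z})^2,\ \phi^*\det = e_{N,E} \end{array} \right\} \Big/ (\text{isom})$$
where
$$e_{N,E} : E[N] \wedge E[N] \to \mu_N$$
is the Weil pairing which is algebraically defined by (6.3.31). Recall that over
$\mathbb{C}$ for $E = \mathbb{C}/\langle 1, z \rangle$, one has
$$E[N] = \left\langle \frac{1}{N}, \frac{z}{N} \right\rangle \Big/ \langle 1, z \rangle \cong (\mathbb{Z}/N\mathbb{Z})^2; \qquad e_{N,E}\left(\frac{a + bz}{N}, \frac{c + dz}{N}\right) = \exp(\pm 2\pi i (ad - bc)/N).$$


Moreover, the modular curve $X(N)$ is an algebraic curve defined over the
cyclotomic field $\mathbb{Q}(\zeta_N)$, $\zeta_N = \exp(2\pi i/N)$, such that the Riemann surface
$X(N)(\mathbb{C})$ is identified with the compact quotient $\Gamma(N)\backslash\overline{\mathbb{H}}$ in such a way that


$$X(N)(\mathbb{C}) \xrightarrow{\ \sim\ } \Gamma(N)\backslash\overline{\mathbb{H}}, \quad \overline{\mathbb{H}} = \mathbb{H} \cup \mathbb{Q} \cup \infty; \qquad Y(N)(\mathbb{C}) \xrightarrow{\ \sim\ } \Gamma(N)\backslash\mathbb{H}$$


for an affine algebraic curve $Y(N)$ defined over $\mathbb{Q}(\zeta_N)$ (see §6.4.2).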

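In order to work with curves defined over $\mathbb{Q}$ let us fix a curve $E$ over $\mathbb{Q}$
and consider the twisted curve $X_E$ over $\mathbb{Q}$ in such a way that it has an affine
model $Y_E$ over $\mathbb{Q}$ defined via the following modular description:


$$Y_E(\mathbb{C}) \xrightarrow{\ \sim\ } \left\{ (E', \phi) \ \middle|\ \begin{array}{l} E' \text{ an elliptic curve over } \mathbb{C} \text{ together with} \\ \text{an isomorphism } \phi : E'[N] \xrightarrow{\sim} E[N],\ \phi^* e_{N,E} = e_{N,E'} \end{array} \right\} \Big/ (\text{isom}).$$


This description gives the set of $\mathbb{Q}$-rational points of $Y_E$:


$$Y_E(\mathbb{Q}) \xrightarrow{\ \sim\ } \left\{ (E', \phi) \ \middle|\ \begin{array}{l} E' \text{ an elliptic curve over } \mathbb{Q} \text{ with an isomorphism} \\ \text{of } G_{\mathbb{Q}}\text{-modules } \phi : E'[N] \xrightarrow{\sim} E[N],\ \phi^* e_{N,E} = e_{N,E'} \end{array} \right\} \Big/ (\text{isom}).$$


Note that $X_E$ is isomorphic to $X(N)$ over $\mathbb{C}$ (and even over $\mathbb{Q}(\zeta_N)$).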

We are interested in the case $N = 5$. In this case we have the following
explicit description of the curve $X_E$ by K. Rubin and A. Silverberg (see
[RubSil94] and [CSS95]):


Proposition 7.68 ([Rub95], Proposition 11). Let $E$ be an elliptic curve
over $\mathbb{Q}$, and $X_E$ the curve over $\mathbb{Q}$ obtained from $X(5)$ by twist as above.

Then the genus of $X_E$ is equal to 0, and there is an explicit parameterization:


$$\psi : \mathbb{P}^1 \xrightarrow{\ \sim\ } X_E \text{ over } \mathbb{Q} \text{ with } t \mapsto (E_t, \phi_t) \qquad (7.7.1)$$


$$E_t : y^2 = x^3 + f_E(t)x + g_E(t), \quad f_E, g_E \in \mathbb{Q}[t], \qquad (7.7.2)$$
and $\deg(f_E) = 20$, $\deg(g_E) = 30$.


Under this parameterization elements $t \in \mathbb{Q}$ are in bijection with elements of
$Y_E(\mathbb{Q})$.


Note that by this construction there is an isomorphism of $G_{\mathbb{Q}}$-modules
$\phi_t : E_t[5] \xrightarrow{\sim} E[5]$, in other words, for any $t \in \mathbb{Q}$, $\overline{\rho}_{5,E_t} \cong \overline{\rho}_{5,E}$.

In order to finish the proof of Theorem 7.67, we need to choose an element
$E' = E_t$ in this family ($t \in \mathbb{Q}$) such that i) $\overline{\rho}_{3,E_t}$ is irreducible and ii) $E_t$ is
semistable.

7.7.4 The end of the proof 


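In order to obtain an irreducible $\overline{\rho}_{3,E_t}$, let us consider an auxiliary twisted
modular curve $X'_E$ over $\mathbb{Q}$, in such a way that it has an affine model $Y'_E$ over
$\mathbb{Q}$ defined via the following modular description:
$$Y'_E(\mathbb{C}) \xrightarrow{\ \sim\ } \left\{ (E', \phi, C_3) \ \middle|\ \begin{array}{l} E' \text{ an elliptic curve over } \mathbb{C} \text{ together with an isomorphism} \\ \phi : E'[5] \xrightarrow{\sim} E[5],\ \phi^* e_{5,E} = e_{5,E'}, \text{ and a subgroup } C_3 \subset E',\ |C_3| = 3 \end{array} \right\} \Big/ (\text{isomorphism}).$$


This description gives the set of $\mathbb{Q}$-rational points of $Y'_E$:


$$Y'_E(\mathbb{Q}) \xrightarrow{\ \sim\ } \left\{ (E', \phi, C_3) \ \middle|\ \begin{array}{l} E'/\mathbb{Q},\ C_3 \subset E'(\overline{\mathbb{Q}}) \text{ (a } G_{\mathbb{Q}}\text{-submodule, } |C_3| = 3) \text{ with an isomorphism} \\ \text{of } G_{\mathbb{Q}}\text{-modules } \phi : E'[5] \xrightarrow{\sim} E[5],\ \phi^* e_{5,E} = e_{5,E'} \end{array} \right\} \Big/ (\text{isomorphism}).$$
Note that $X'_E$ is isomorphic over $\mathbb{C}$ to the compact quotient
$$(\Gamma(5) \cap \Gamma_0(3))\backslash\overline{\mathbb{H}}.$$


The genus of $X'_E$ can be computed with the classical techniques described in
Chapter I of Shimura’s book [Shi71]; the answer is $g(X'_E) = 9$.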
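The genus can be recomputed from the standard formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$. The sketch below is one possible way to do this and is not taken from [Shi71]: it assumes $\nu_2 = \nu_3 = 0$ (the image of $\Gamma(5)$ in $\mathrm{PSL}_2(\mathbb{Z})$ has no elliptic elements), identifies the cosets of $\Gamma(5) \cap \Gamma_0(3)$ with $\mathrm{PSL}_2(\mathbb{F}_5) \times \mathbb{P}^1(\mathbb{F}_3)$, and counts cusps as orbits of right multiplication by $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. All function names are ours.

```python
from fractions import Fraction
from itertools import product

def psl2(p):
    """Elements of PSL_2(F_p), p odd: matrices (a, b, c, d) of determinant 1, up to sign."""
    reps = set()
    for g in product(range(p), repeat=4):
        a, b, c, d = g
        if (a * d - b * c) % p == 1:
            reps.add(min(g, tuple(-x % p for x in g)))
    return reps

def times_T(g, p):
    """Right multiplication by T = [[1, 1], [0, 1]], then sign normalization."""
    a, b, c, d = g
    h = (a, (a + b) % p, c, (c + d) % p)
    return min(h, tuple(-x % p for x in h))

def line_times_T(line):
    """Action of T on P^1(F_3), viewed as bottom rows (c : d) of matrices."""
    c, d = line
    d = (c + d) % 3
    lead = c if c else d
    inv = pow(lead, -1, 3)
    return (c * inv % 3, d * inv % 3)

def count_orbits(states, step):
    """Number of orbits of the permutation `step` on the finite set `states`."""
    seen, orbits = set(), 0
    for s in states:
        if s in seen:
            continue
        orbits += 1
        while s not in seen:
            seen.add(s)
            s = step(s)
    return orbits

def genus(index, cusps, nu2=0, nu3=0):
    """Genus formula for a modular curve of given index and number of cusps."""
    return 1 + Fraction(index, 12) - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(cusps, 2)

G5 = psl2(5)
LINES = [(0, 1), (1, 0), (1, 1), (1, 2)]
cusps_X5 = count_orbits(G5, lambda g: times_T(g, 5))
STATES = [(g, line) for g in G5 for line in LINES]
cusps_XE = count_orbits(STATES, lambda s: (times_T(s[0], 5), line_times_T(s[1])))

if __name__ == "__main__":
    print(genus(len(G5), cusps_X5), genus(len(STATES), cusps_XE))
```

The same computation for $\Gamma(5)$ alone returns genus 0, in agreement with Proposition 7.68.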

By the theorem of Faltings (see §5.5) $Y'_E(\mathbb{Q})$ is a finite set. Thus, for all
but finitely many $t \in \mathbb{Q}$ the representation $\overline{\rho}_{3,E_t}$ is irreducible.

Concerning semistability, let us observe that:


- for any $l \ne 5$ we have


$$\overline{\rho}_{5,E_t}|_{I_l} \cong \overline{\rho}_{5,E}|_{I_l} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}.$$


This directly implies the semistability for such $l$, using the algebraic
definition of semistability.

- for $l = 5$, the semistability at 5 follows from the explicit equation (7.7.1)
of the curves, via the geometric definition 7.10 of semistability: if $t$ tends
to $t_0$ in the 5-adic topology, then the coefficients of the minimal (5-adic)
equations of $E_t$ converge in the 5-adic topology to the coefficients of the
minimal (5-adic) equation of $E = E_{t_0}$. Thus $E_t$ becomes semistable at 5
for all $t \in \mathbb{Q}$ sufficiently close to $t_0$.

The most important insights.


Here is how they were described by Wiles himself in the introduction to his
paper [Wi]:

- the discovery in the spring of 1991 of the invariants $\eta_A$ (known at that time
as “congruence modules” of Hida in the case of $p$-adic Hecke algebras);

- the switch from $p = 3$ to $p = 5$ (found in May 1993);

- the use of the horizontal Iwasawa theory ($J$-structures) in September 1994.


An interesting account of the history of this subject is given in [Mozz].


Part III 


Analogies and Visions 


III-0


Introductory survey to part III: motivations and 
description 


III.1 Analogies and differences between numbers and
functions: $\infty$-point, Archimedean properties etc.


This part was conceived as an explanation of some basic intuitive ideas that 
underlie modern number-theoretical thinking. One subject could have been 
called Analogies between numbers and functions. 


III.1.1 Cauchy residue formula and the product formula 


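Let us explain the analogy between a number field, i.e., a finite extension of
$\mathbb{Q}$, and the field $\mathbb{C}(S)$ of meromorphic functions on a smooth complete curve
$S$, following [SABK94], p.3.

An example of this analogy is given by the Cauchy residue formula,


$$\sum_{x \in S} v_x(f) = \sum_{x \in S} \mathrm{Res}_x\left(\frac{df}{f}\right) = 0, \qquad (III.1.1)$$


where $f \in \mathbb{C}(S)^\times$, $v_x(f)$ is the order of $f$ at $x$, and $\mathrm{Res}_x$ denotes the residue at $x$ of differential forms.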
When $f \in \mathbb{Q}^\times$ is a rational number we have the product formula


$$|f| = \prod_p p^{v_p(f)} \qquad (III.1.2)$$


where $p$ runs over all integral primes and $v_p(f) \in \mathbb{Z}$ is the $p$-adic valuation of
$f$.

If we define


$$v_\infty(f) = -\log|f|,$$


we may rewrite the equality (III.1.2) as


$$\sum_p v_p(f)\log(p) + v_\infty(f) = 0, \qquad (III.1.3)$$


an analog for $\mathbb{Q}$ of equation (III.1.1) for the field $\mathbb{C}(S)$. One can see from this
example that, in this analogy, the complete curve $S$ is analogous to the
affine scheme $\mathrm{Spec}(\mathbb{Z})$ to which is added a point at infinity (at this point the
Archimedean norm is used instead of discrete valuations).
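The logarithmic form of the product formula is easy to test on an example. The sketch below (helper names are ours) takes $v_\infty(f) = -\log|f|$ and evaluates $\sum_p v_p(f)\log p + v_\infty(f)$ for a nonzero rational $f$; the result should vanish up to floating-point error.

```python
from fractions import Fraction
from math import log

def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def prime_factors(n):
    """Set of primes dividing the nonzero integer n."""
    n = abs(n)
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def product_formula_sum(f):
    """Return sum_p v_p(f) log p + v_inf(f), where v_inf(f) = -log|f| and f != 0."""
    f = Fraction(f)
    num, den = f.numerator, f.denominator
    primes = prime_factors(num) | prime_factors(den)
    finite_part = sum((valuation(num, p) - valuation(den, p)) * log(p) for p in primes)
    v_infinity = -(log(abs(num)) - log(den))
    return finite_part + v_infinity

if __name__ == "__main__":
    print(abs(product_formula_sum(Fraction(-360, 49))) < 1e-12)
```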


III.1.2 Arithmetic varieties 


In general, let $X$ be an arithmetic variety. By this we mean a regular scheme,
projective and flat over $\mathbb{Z}$.
In other words, we consider a system of polynomial equations


$$f_1(T_0, \dots, T_N) = f_2(T_0, \dots, T_N) = \dots = f_m(T_0, \dots, T_N) = 0 \qquad (III.1.4)$$


where $f_1, \dots, f_m \in \mathbb{Z}[T_0, \dots, T_N]$ are homogeneous polynomials with integral
coefficients. These define the projective scheme $X = \mathrm{Proj}(S)$, where $S$
is the quotient of $\mathbb{Z}[T_0, \dots, T_N]$ by the ideal generated by $f_1, \dots, f_m$. The
points of $X$ are those homogeneous prime ideals $P$ in $S$ which do not contain
the augmentation ideal. The structure morphism $f : X \to \mathrm{Spec}\,\mathbb{Z}$ maps
$P$ to $P \cap \mathbb{Z}$. The fiber of $f$ over a prime integer $p$ (special fiber) is the variety
$f^{-1}(p\mathbb{Z}) = X/p = \mathrm{Proj}(S/pS)$ over the field with $p$ elements. The generic fiber
is $f^{-1}((0)) = X_{\mathbb{Q}} = \mathrm{Proj}(S \otimes_{\mathbb{Z}} \mathbb{Q})$. We assume that $X$ is regular and that $f$
is flat, i.e. $S$ is torsion free. It follows that $X/p$ is smooth, except for finitely
many values of $p$, like $q$ in Fig. III-0.1, where it may not even be reduced.

In the same way that we completed $\mathrm{Spec}\,\mathbb{Z}$ by adding a point $\infty$ to it, we
“complete” the family $X$ of varieties over $\mathrm{Spec}\,\mathbb{Z}$ by adding to it the complex
variety $X_\infty = X(\mathbb{C})$, i.e. the set of complex solutions of (III.1.4), viewed as
the fiber at infinity. We think of the whole family as analogous to a complex
manifold $Y$ fibered over a smooth complete curve $S$ via a flat proper map
$f : Y \to S$. If the fibers of $f$ have dimension one, $X$ has Krull dimension two
(we call it an arithmetic surface). Notice that an integral solution of (III.1.4)
is a rational point on $X(\mathbb{Z}) = X(\mathbb{Q}) \subset X(\mathbb{C})$, i.e. a section of $f$.
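To make the special fibers concrete, one can count the $\mathbb{F}_p$-points of $X/p$ for a small system of homogeneous equations. The sketch below uses a hypothetical example that does not appear in the text, the conic $T_0^2 + T_1^2 - T_2^2 = 0$ in $\mathbb{P}^2$ (no claim is made that this model satisfies the regularity assumption). Each fiber over an odd prime is a smooth conic with $p + 1$ points, while over $p = 2$ the equation becomes $(T_0 + T_1 + T_2)^2 = 0$, a non-reduced fiber.

```python
from itertools import product

def projective_points(p, n):
    """Representatives of P^n(F_p): tuples whose first nonzero coordinate is 1."""
    points = []
    for v in product(range(p), repeat=n + 1):
        nonzero = [c for c in v if c]
        if nonzero and nonzero[0] == 1:
            points.append(v)
    return points

def fiber_points(polys, p, n):
    """Points of P^n(F_p) at which every polynomial in `polys` vanishes mod p."""
    return [v for v in projective_points(p, n) if all(f(*v) % p == 0 for f in polys)]

def conic(x, y, z):
    """Hypothetical example equation: x^2 + y^2 - z^2."""
    return x * x + y * y - z * z

if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, len(fiber_points([conic], p, 2)))
```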


III.1.3 Infinitesimal neighborhoods of fibers 


One can also consider reductions of $X$ (defined over $\mathbb{Z}$) modulo $p^n$ for some
prime $p$. The limit as $n \to \infty$ defines a $p$-adic completion of $X_{\mathbb{Z}}$. This can be
thought of as an “infinitesimal neighborhood” of the fiber at $p$.

The picture is more complicated at arithmetic infinity, since one does not
have a suitable notion of “reduction modulo $\infty$” available to define the closed
fiber. On the other hand, one does have an analogue of the $p$-adic completion
at hand. This is given by the Riemann surface (smooth projective
algebraic curve over $\mathbb{C}$) determined by the equation of the algebraic curve
over $\mathbb{Q}$, under the embedding $\mathbb{Q} \subset \mathbb{C}$,


$$X(\mathbb{C}) = X \otimes_{\mathbb{Q}} \mathrm{Spec}(\mathbb{C}),$$


with the absolute value $|\cdot|$ at the infinite prime replacing the $p$-adic valuation.
III.2 Arakelov geometry, fiber over $\infty$, cycles, Green
functions (d’après Gillet-Soulé)


In order to control the size of the rational points $P$, S.Yu. Arakelov (cf.
[Ara74a]) had the brilliant idea of considering Hermitian metrizations of various
linear objects related to algebraic varieties (invertible sheaves, tangent
bundles etc.). Let us take an algebraic vector bundle $E$ on $X$, endowed with a
smooth Hermitian metric $h$ on the corresponding holomorphic vector bundle
$E_\infty$ on $X_\infty$. The pair $\overline{E} = (E, h)$ will be called a Hermitian vector bundle on
$X$. This gives a method to compactify arithmetic schemes over number fields
at the arithmetical infinity.

According to Arakelov’s program [Ara74a], [Ara74b], to an algebraic variety
$X$ over a number field $K$ one can attach a completed arithmetical variety
$\overline{X}$ of dimension $\dim(X) + 1$,


$$\overline{X} \to \overline{\mathrm{Spec}\,\mathcal{O}_K} = \mathrm{Spec}\,\mathcal{O}_K \cup \{\infty_1, \dots, \infty_r\},$$


where $\mathcal{O}_K$ is the maximal order of $K$, and $\{\infty_1, \dots, \infty_r\}$ is the set of all places at
$\infty$ of $K$ (see [GS92], [SABK94], [La88]).
For an algebraic number field $K$ of degree $n = [K : \mathbb{Q}]$ admitting $n =
r_1 + 2r_2$ embeddings
$$\alpha : K \to \mathbb{C}, \qquad (III.2.1)$$


there are $r = r_1 + r_2$ Archimedean primes, with $r_1$ real embeddings and $r_2$
pairs of complex conjugate embeddings.

When $\mathrm{Spec}\,\mathcal{O}_K$ is compactified by adding Archimedean primes
$\{\infty_1, \dots, \infty_r\}$, one also obtains $n$ complex varieties $X_\alpha(\mathbb{C})$, obtained from
the embeddings $\alpha : K \to \mathbb{C}$. Of these complex varieties, $r_1$ carry a real
involution.

In particular, each curve has a well defined minimal model over $\mathcal{O}$ which
is called an arithmetical surface (since we added an arithmetical dimension
to the geometric one). Adding metrics at infinity to this, Arakelov developed
the intersection theory of arithmetical divisors. Heights in this picture become
the (exponentiated) intersection index, see [Ara74b], [La88].

This theory was vastly generalized by H. Gillet and C. Soulé [GS91], [GS92],
[SABK94], following some suggestions in [Man84]. Figure III-0.1, which
is reproduced here from [SABK94] with the kind permission of Ch. Soulé, is a
visualization of an arithmetical surface.

Its fibers over the closed points of $\mathrm{Spec}(\mathcal{O})$ can be non-singular
(“non-degenerate”, or with “good reduction”) or singular (having “bad reduction”).
Rational points of the generic fiber correspond to the horizontal arithmetical
divisors; there are also vertical divisors (components of closed fibers) and
“vertical divisors at infinity” added formally, together with an ad hoc definition
of their intersection indices with other divisors defined via Green’s function,
see [SABK94].
[Figure: the fibers $f^{-1}(p) = X/p$ and $X/q$ of the map $f : X \to \mathrm{Spec}(\mathbb{Z})$]


Fig. III-0.1. An arithmetic surface. 


Arakelov’s picture played a prominent role in Faltings’ proof and the sub- 
sequent development of his work. 

Figure III-0.1, which is reproduced here from [SABK94] with the kind
permission of Ch. Soulé and Cambridge University Press, is a visualization of
an arithmetic surface.


III.2.1 Arithmetic Chow groups 


Let $X$ be an arithmetic variety and $\overline{E}$ a Hermitian vector bundle on $X$. One
can attach to $\overline{E}$ characteristic classes with values in arithmetic Chow groups.


More specifically, an arithmetic cycle is a pair $(Z, g)$ consisting of an algebraic
cycle over $X$, i.e. a finite sum $\sum_\alpha n_\alpha Z_\alpha$, $n_\alpha \in \mathbb{Z}$, where $Z_\alpha$ is a
closed irreducible subscheme of $X$ of fixed codimension $p$, and a Green current
$g$ for $Z$. By this we mean that $g$ is a real current on $X_\infty$ which satisfies
$F_\infty^*(g) = (-1)^{p-1} g$ and

$$dd^c g + \delta_Z = \omega, \qquad (III.2.2)$$


where $\omega$ is the current attached to a smooth form on $X_\infty$ and $\delta_Z$ is the current
given by integration on $Z_\infty$:


$$\delta_Z(\eta) = \sum_\alpha n_\alpha \int_{Z_\alpha(\mathbb{C})} \eta \qquad (III.2.3)$$


for any smooth form $\eta$ of appropriate degree (here $F_\infty^*$ denotes the morphism
induced by the complex conjugation $F_\infty$ on $X_\infty$).
The arithmetic Chow group $\widehat{CH}^p(X)$ is the Abelian group of arithmetic
cycles, modulo the subgroup generated by pairs $(0, \partial u + \overline{\partial} v)$ and
$(\mathrm{div} f, -\log|f|^2)$, where $u$ and $v$ are arbitrary currents of the appropriate
degree and $\mathrm{div} f$ is the divisor of a non-zero rational function $f$ on some
irreducible closed subscheme of codimension $p - 1$ in $X$.


III.2.2 Arithmetic intersection theory and arithmetic 
Riemann-Roch theorem 


The important fact that the arithmetic Chow groups $\widehat{CH}^p(X)$ have functorial
properties is studied in [SABK94].

Given two arithmetic cycles $(Z, g)$ and $(Z', g')$, we need a Green current
for their intersection. The formula


$$g * g' = \omega g' + g\,\delta_{Z'}$$


where $\omega$ is defined as in (III.2.2), involves a product of two currents $g\,\delta_{Z'}$. To
make sense of it in general we need to show that one can take for $g$ a smooth
form on $X_\infty - Z_\infty$ of logarithmic type along $Z_\infty$.

Then one can define characteristic classes for Hermitian vector bundles $\overline{E}$
on $X$, in particular, a Chern character class


$$\widehat{ch}(\overline{E}) \in \bigoplus_{p \ge 0} \widehat{CH}^p(X) \otimes \mathbb{Q}. \qquad (III.2.4)$$


This class satisfies the usual axiomatic properties of a Chern character, but it 
depends on the choice of a metric on E. It is additive only on orthogonal direct 
sums, but in general its failure to be additive on exact sequences is given by 
the secondary characteristic class of Bott-Chern. Similar results hold for the 
Todd class Td(E) (cf. Chapter IV of [SABK94]). The arithmetic Riemann- 
Roch theorem is formulated in terms of these classes, for a proper flat map 
f : X — Y, as a formula for the first arithmetic Chern class for the direct 
image map on Hermitian vector bundles. 

The main application of this result gives an asymptotic behaviour of the type


$h^0(X, E \otimes L^n) \ge \frac{\mathrm{rk}(E)\,\hat{L}^{d+1}}{(d+1)!}\, n^{d+1} + O(n^d \log n)$ (III.2.5)


for an ample L with a positive metric, where $h^0(X, E \otimes L^n)$ is the logarithm
of the number of global sections $s \in H^0(X, E \otimes L^n)$ such that $\|s(x)\| < 1$ for
every $x \in X_\infty$, and $\hat{L}^{d+1} \in \mathbb{R}$ denotes the arithmetic self-intersection.

The arithmetic Riemann-Roch theorem was discussed in [Fal92]; it was
explained that the Hirzebruch-Riemann-Roch theorem (HRR) gives a formula
for $\chi(E)$, $E \to X$ an algebraic vector bundle over an algebraic manifold X,
in terms of characteristic classes of X and E: $\chi(E) = [Td(X) \cdot ch(E)]_{\dim X}$.
The Grothendieck-Riemann-Roch (GRR) theorem, a relative version, states
that for an algebraic morphism $f: X \to Y$ and an algebraic vector bundle
E on X, one has an equality of mixed cohomology classes:
$ch(f_* E) = f_*(Td(X|Y) \cdot ch(E))$, and contains HRR as the special case where Y is a point.
The proof of HRR is by means of elliptic differential operators and analytic
methods or cobordism. The proof of GRR, on the other hand, is easier and
more algebraic.
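
As a quick sanity check of the HRR formula, the following worked case may help. It is the classical example of a vector bundle E of rank r on a smooth projective curve X of genus g, and it is not taken from [Fal92]; the symbols r, g and deg E are notation introduced only for this illustration.

```latex
\begin{align*}
\mathrm{ch}(E) &= r + c_1(E), \qquad \mathrm{Td}(X) = 1 + \tfrac{1}{2}\, c_1(T_X),\\
[\mathrm{Td}(X)\cdot \mathrm{ch}(E)]_{1} &= c_1(E) + \tfrac{r}{2}\, c_1(T_X),\\
\chi(E) &= \deg E + \tfrac{r}{2}\,(2 - 2g) = \deg E + r\,(1 - g).
\end{align*}
```

The last line is the classical Riemann-Roch theorem for curves, recovered as the degree-one component of the product of the Todd class and the Chern character.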

This proof was carried over to the arithmetic case in [Fal92].

The data for the arithmetic case are as follows: $f: X \to Y$ is a morphism
of arithmetic varieties, both of which are defined over a common arithmetic
variety B (such as $\mathrm{Spec}(O_K)$), and E is an arithmetic vector bundle on X. One
needs the following objects: (1) $K(X)$, the equivalence classes of such E's,
(2) $A^*(X)$, the arithmetic Chow ring, (3) $ch: K(X) \to A^*(X)$,
(4) $f_*: K(X) \to K(Y)$, the arithmetic push forward, (5) a class
$Td^R(X|Y) \in A^*(X)$, and (6) an intersection product in $A^*(X)$. Then the
arithmetic Riemann-Roch theorem (ARR) states:
$ch(f_* E) = f_*(Td^R(X|Y) \cdot ch(E))$ in $A^*(Y)$ for $E \in K(X)$. The
HRR case of this is now $Y = B = \mathrm{Spec}(O_K)$, and for surfaces ($X_K$ a curve),
the component in degree 2 yields the RR for arithmetic surfaces proved by
G. Faltings in [Fal84], the first case for which ARR was proven. Here the method
was the construction of volume forms on the cohomology. Following Faltings'
introduction, this led to new interest in this topic, and soon there was rapid
progress: Deligne generalised the volume form to more cases, and Gillet and
Soulé developed an arithmetic intersection theory for general varieties, as well
as Hermitian K-theory. Then they joined efforts with Bismut and managed to
define the determinant of cohomology of an arithmetic variety, cf. [BiGS88].
One could then ask for a RR result on this determinant. It turned out that
even for the projective space the immediate generalisation of the classical RR
was false. To remedy this they introduced a secondary class $R(x)$ so that a
modified version (using R) remains true. Bismut and Lebeau could prove a
RR result for closed immersions, from which the RR for determinant bundles
follows with some more work, see [SABK94].


The theory of intersections and the arithmetic Riemann-Roch theorem were
also discussed in a Bourbaki talk [Bo90] by J.-B. Bost (see also [BoCo95]).


III.2.3 Geometric description of the closed fibers at infinity


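A general picture of an arithmetic surface over $\mathrm{Spec}(O_K)$ is as follows:


$X_\wp \;\to\; X_{\mathrm{Spec}(O_K)} \;\hookrightarrow\; \overline{X_{\mathrm{Spec}(O_K)}} \;\leftarrow\; ?$

$\downarrow \qquad\quad \downarrow \qquad\qquad\quad \downarrow \qquad\qquad \downarrow$

$\wp \;\to\; \mathrm{Spec}(O_K) \;\hookrightarrow\; \overline{\mathrm{Spec}(O_K)} \;\leftarrow\; \alpha$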


where we do not have an explicit geometric description of the closed fibers
over the Archimedean primes.

Formally one can enlarge the group of divisors on the arithmetic surface
by adding formal real linear combinations of irreducible “closed vertical fibers
at infinity” $\sum_\alpha \lambda_\alpha F_\alpha$. Here the fibers $F_\alpha$ are only treated as formal symbols,
and no geometric model of such fibers is provided. The remarkable fact is that
Hermitian geometry on the complex varieties $X_\alpha(\mathbb{C})$ is sufficient to specify the
contribution of such divisors to intersection theory on the arithmetic surface,
even without an explicit knowledge of the closed fiber.

The main idea of Arakelov geometry is that it is sufficient to work
with the “infinitesimal neighborhood” $X_\alpha(\mathbb{C})$ of the fibers $F_\alpha$ to have well
defined intersection indices.

From the point of view of the classical geometry of degenerations of algebraic
curves over a disk $\Delta$ with a special fiber over 0, the analogous statement
would say that the geometry of the special fiber is completely determined by
the generic fiber (cf. [Mar04], §3 of Chapter 3). This would be a very strong
statement on the form of the degeneration: for instance blowing up points on
the special fiber is not seen by just looking at the generic fiber. Investigating
this analogy leads one to expect that the fiber at infinity should behave as
in the totally degenerate case, cf. the discussion in §8.2.3. This is the case of
maximal degeneration, where all the components of the closed fiber
are $\mathbb{P}^1$'s and the geometry of the degeneration is completely encoded by the
dual graph, which describes in a purely combinatorial way how these $\mathbb{P}^1$'s are
joined. The dual graph has a vertex for each component of the closed fiber
and an edge for each double point.

For an arithmetic surface (dim X = 1), the local intersection multiplicity
of two finite, horizontal, irreducible divisors $D_1$ and $D_2$ on $X_{O_K}$ is given by


$[D_1, D_2] = [D_1, D_2]_{fin} + [D_1, D_2]_{inf}$,


where the first term counts the contribution from the finite places (i.e. what
happens over $\mathrm{Spec}(O_K)$), and the second term is the contribution of the
Archimedean primes, i.e. the part of the intersection that happens over
arithmetic infinity.

While the first term is computed in algebro-geometric terms, from the
local equations for the divisors $D_i$ at $\wp$, the second term is defined as a sum
of values of Green functions $g_\alpha$ on the Riemann surfaces $X_\alpha(\mathbb{C})$,


$[D_1, D_2]_{inf} = - \sum_\alpha \varepsilon_\alpha \Big( \sum_{\beta,\gamma} g_\alpha(P^\alpha_{1,\beta}, P^\alpha_{2,\gamma}) \Big)$,


at points
$\{P^\alpha_{i,\beta} \mid \beta = 1, \ldots, [K(D_i) : K]\} \subset X_\alpha(\mathbb{C})$,


for a finite extension $K(D_i)$ of K, determined by $D_i$. Here $\varepsilon_\alpha = 1$ for real
embeddings and $\varepsilon_\alpha = 2$ for complex embeddings (see [CS86], [GS92], [SABK94],
[La88] for a detailed account of these notions of Arakelov geometry).

Further evidence for the similarity between the Archimedean and the totally
degenerate fibers came from an explicit computation of the Green function
at the Archimedean places derived in [Man91] in terms of a Schottky
uniformization of the Riemann surface $X_\alpha(\mathbb{C})$, which is discussed in more
detail in §8.1. Such a uniformization has an analogue at a finite prime, in
terms of p-adic Schottky groups, only in the totally degenerate case. Another
source of evidence comes from a cohomological theory of the local factors at
Archimedean primes, developed by K. Consani in [Cons98], valid for any
arithmetic variety, see also §8.2 for more details. This general construction shows
that the resulting description of the local factor as a regularized determinant at
the Archimedean primes most closely resembles the case of totally degenerate
reduction at a finite prime.

One can present both results in the light of the noncommutative space
given by a spectral triple $(\mathcal{O}_A, \mathcal{H}, D)$ discussed in §8.3, where the data
$(\mathcal{O}_A, \mathcal{H}, D)$ consist of a C*-algebra $\mathcal{O}_A$ (or more generally of a smooth
subalgebra of a C*-algebra) with a representation as bounded operators on a
Hilbert space $\mathcal{H}$ and an operator D on $\mathcal{H}$ that verifies the main properties
of a classical Dirac operator (a square root of the Laplacian) on a smooth
spin manifold.


III.3 Theory of ζ-functions, local factors at ∞, Serre’s
Γ-factors; and generally an interpretation of zeta
functions as determinants of the arithmetical Frobenius:
Deninger’s program


(cf. [Se70a], [Den94], [Den01], [Lei03], and §7, Ch.3 of [Mar04]).

An important invariant of arithmetic varieties is the L-function. This
is written as a product of contributions from the finite primes and the
Archimedean primes,


$\prod_{\wp \in \overline{\mathrm{Spec}(O_K)}} L_\wp(H^m(X), s)$, (III.3.1)


see §6.2.7. The reason why one needs to include the contribution of the
Archimedean primes can be seen in the case of the “affine line” $\mathrm{Spec}(\mathbb{Z})$, where
one has the Riemann zeta-function, which is written as the Euler product


$\zeta(s) = \prod_p (1 - p^{-s})^{-1}$. (III.3.2)


However, to have a nice functional equation, one needs to consider the product

$\zeta^*(s) = \zeta(s)\,\Gamma(s/2)\,\pi^{-s/2}$, (III.3.3)


which includes a contribution of the Archimedean prime, expressed in terms
of the Gamma function.

An analogy with algebraic geometry and Weil’s conjectures (see §6.1.3)
suggests thinking of the functional equation as a sort of “Poincaré duality”,
which holds for a compact manifold, hence the need to “compactify” by adding
the Archimedean primes (and the Archimedean fibers of arithmetic varieties).
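
The role of the Gamma factor can be checked numerically in a small sketch. The code below is only an illustration under stated assumptions: it truncates the Euler product (III.3.2) at a hypothetical bound, and it uses the classical values ζ(2) = π²/6 and ζ(−1) = −1/12 as inputs rather than computing an analytic continuation. The function names are ours, not the book's.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def euler_product(s, bound):
    """Truncated Euler product: product over primes p <= bound of (1 - p^-s)^-1."""
    prod = 1.0
    for p in primes_up_to(bound):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

def completed(zeta_value, s):
    """Completed value zeta(s) * Gamma(s/2) * pi^(-s/2), given zeta(s)."""
    return zeta_value * math.gamma(s / 2) * math.pi ** (-s / 2)

# Truncated Euler product at s = 2 versus the classical value pi^2 / 6.
zeta2_approx = euler_product(2, 100_000)
zeta2_exact = math.pi ** 2 / 6

# Functional equation s <-> 1 - s tested at s = 2 and 1 - s = -1,
# using the classical value zeta(-1) = -1/12 (an input, not computed here).
left = completed(zeta2_exact, 2)
right = completed(-1.0 / 12.0, -1)
print(zeta2_approx, zeta2_exact, left, right)
```

Both completed values agree, which is the symmetry s ↔ 1 − s that the bare Euler product does not exhibit on its own.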

When one looks at an arithmetic variety over a finite prime $\wp \in \mathrm{Spec}(O_K)$,
the fact that the reduction lives over a residue field of positive characteristic
implies that there is a special operator, the geometric Frobenius $Fr^*_\wp$, acting on
a suitable cohomology theory (étale cohomology), induced by the Frobenius
automorphism $\varphi_\wp$ of $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$.

The local L-factors of (III.3.1) at finite primes encode the action of the
geometric Frobenius in the form


$L_\wp(H^m(X), s) = \det\big(1 - Fr^*_\wp\, N(\wp)^{-s} \,\big|\, H^m(\bar{X}, \mathbb{Q}_\ell)^{I_\wp}\big)^{-1}$. (III.3.4)


Here we are considering the action of the geometric Frobenius $Fr^*_\wp$ on the
inertia invariants $H^m(\bar{X}, \mathbb{Q}_\ell)^{I_\wp}$ of the étale cohomology; a precise definition
of these arithmetic structures is beyond our purpose. We wish to describe
the contribution of the Archimedean primes to the product (III.3.1) by giving
a quick heuristic explanation of (III.3.4).

For X a smooth projective variety (in any dimension) defined over $\mathbb{Q}$, the
notation $\bar{X} := X \otimes \mathrm{Spec}(\bar{\mathbb{Q}})$ is used. One can write the local L-factor (III.3.4)
equivalently as


$L_\wp(H^m(X), s) = \prod_{\lambda \in \mathrm{Spec}(Fr^*_\wp)} (1 - \lambda q^{-s})^{-\dim H^m(\bar{X}, \mathbb{Q}_\ell)^{I_\wp}_\lambda}$, (III.3.5)


where $q = N(\wp)$ and $H^m(\bar{X}, \mathbb{Q}_\ell)^{I_\wp}_\lambda$ is the (generalized) eigenspace of the
Frobenius with eigenvalue $\lambda$.
An important conclusion is that the local L-factor in (III.3.5) depends
upon the data
$(H^*(\bar{X}, \mathbb{Q}_\ell)^{I_\wp}, Fr^*_\wp)$ (III.3.6)
of a vector space, which has a cohomological interpretation, together with a
linear operator acting upon it.
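
The passage from the determinant form (III.3.4) to the product over the spectrum (III.3.5) can be made concrete in a toy case. The sketch below assumes a hypothetical two-dimensional cohomology space on which Frobenius acts with trace a = 2 and determinant q = 5 (values chosen only for illustration, as for the first cohomology of an elliptic curve); it is not a computation with actual étale cohomology.

```python
import cmath

q = 5  # hypothetical size of the residue field
a = 2  # hypothetical trace of Frobenius (satisfies |a| <= 2*sqrt(q))

# Companion matrix with trace a and determinant q, standing in for Frobenius.
F = [[0, -q], [1, a]]

def det_one_minus_TF(T):
    """det(1 - T*F) for the 2x2 matrix F, expanded directly."""
    m00 = 1 - T * F[0][0]
    m01 = -T * F[0][1]
    m10 = -T * F[1][0]
    m11 = 1 - T * F[1][1]
    return m00 * m11 - m01 * m10

def eigenvalues():
    """Roots of the characteristic polynomial x^2 - a*x + q of F."""
    disc = cmath.sqrt(a * a - 4 * q)
    return ((a + disc) / 2, (a - disc) / 2)

def product_over_spectrum(T):
    """Product over eigenvalues lambda of (1 - lambda*T), as in (III.3.5)."""
    l1, l2 = eigenvalues()
    return (1 - l1 * T) * (1 - l2 * T)

def local_factor(s):
    """Toy local factor det(1 - F * q^-s)^-1, as in (III.3.4)."""
    T = q ** (-s)
    return 1 / det_one_minus_TF(T)

print(det_one_minus_TF(0.3), product_over_spectrum(0.3), local_factor(2))
```

Both expressions give the polynomial 1 − aT + qT² in T = q^{−s}, so the local factor depends only on the pair (vector space, operator), here encoded by the matrix F.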


III.3.1 Archimedean L-factors


Since the étale cohomology satisfies the compatibility isomorphism with the
singular (Betti) cohomology:


$H^i(\bar{X}, \mathbb{Q}_\ell) \cong H^i(X(\mathbb{C}), \mathbb{Q}) \otimes \mathbb{Q}_\ell$, (III.3.7)


this indicates that one can work with the smooth complex manifold $X(\mathbb{C})$
and gain information on the “closed fiber” at arithmetic infinity, and expect
that the contribution of the Archimedean primes to the L-function may be
expressed in terms of the cohomology $H^*(X(\mathbb{C}), \mathbb{Q})$ (or in terms of the de
Rham cohomology with complex coefficients).


Let us recall that the expected contribution of the Archimedean primes
(cf. [Se70a]) is determined by Γ-factors attached to the Hodge structure


$H^m(X(\mathbb{C})) = \bigoplus_{p+q=m} H^{p,q}(X(\mathbb{C}))$.


Namely, one has the following product of Gamma functions:

$L(H, s) = \prod_{p,q} \Gamma_\mathbb{C}(s - \min(p, q))^{h^{p,q}}$ at a complex place,
$L(H, s) = \prod_{p<q} \Gamma_\mathbb{C}(s - p)^{h^{p,q}} \prod_p \Gamma_\mathbb{R}(s - p)^{h^{p,+}}\, \Gamma_\mathbb{R}(s - p + 1)^{h^{p,-}}$ at a real place, (III.3.8)

where $H = H^m(X(\mathbb{C}))$, $s \in \mathbb{C}$, $h^{p,q} = \dim_\mathbb{C} H^{p,q}$ and $h^{p,\pm}$ is the dimension
of the $\pm(-1)^p$-eigenspace of the $\mathbb{C}$-linear involution $F_\infty$ on $H^{p,p}$. Recall that by
definition 6.2.4,


$\Gamma_\mathbb{C}(s) = (2\pi)^{-s}\,\Gamma(s)$, $\Gamma_\mathbb{R}(s) = 2^{-1/2}\pi^{-s/2}\,\Gamma(s/2)$, where $\Gamma(s) = \int_0^\infty e^{-t} t^s \frac{dt}{t}$.

Let us try to seek a unified picture of what happens at the finite and at the
infinite primes. In particular, there should be a suitable reformulation of the
local factors (III.3.4) and (III.3.8) where both formulae can be expressed in
the same way.


III.3.2 Deninger’s formulae 


Deninger in [Den91], [Den92] and [Den94] expressed both local factors (III.3.4) 
and (III.3.8) as certain infinite determinants. 

Recall that the Ray-Singer determinant of an operator T with pure point
spectrum with finite multiplicities $\{m_\lambda\}_{\lambda \in \mathrm{Spec}(T)}$ is defined as


$\det_\infty(s - T) := \exp\Big(-\frac{d}{dz}\,\zeta_T(s, z)\Big|_{z=0}\Big)$, (III.3.9)

where the zeta function of T is defined as


$\zeta_T(s, z) = \sum_{\lambda \in \mathrm{Spec}(T)} m_\lambda (s - \lambda)^{-z}$. (III.3.10)
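
To see how a Gamma function can arise from such a regularized determinant, consider the following standard illustration, which is not taken from the text: an operator T with simple spectrum {0, −1, −2, ...} (our choice of example) and Re s > 0. The computation relies on Lerch's classical formula for the Hurwitz zeta function.

```latex
\begin{align*}
\zeta_T(s,z) &= \sum_{n \ge 0} (s+n)^{-z} = \zeta_{\mathrm{Hurwitz}}(z,s),\\
\frac{d}{dz}\,\zeta_T(s,z)\Big|_{z=0} &= \log\Gamma(s) - \tfrac{1}{2}\log(2\pi)
  \qquad\text{(Lerch's formula)},\\
\det{}_{\infty}(s-T) &= \exp\Big(-\frac{d}{dz}\,\zeta_T(s,z)\Big|_{z=0}\Big)
  = \frac{\sqrt{2\pi}}{\Gamma(s)}.
\end{align*}
```

So the inverse of a Gamma function is, up to a constant, the regularized determinant of an operator whose spectrum is an arithmetic progression, which is the mechanism behind the determinant expressions for the Archimedean factors.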


Suitable conditions for the convergence of these expressions in the case of the
local factors are described in [Man95]. Ch. Deninger showed that (III.3.4) can
be written equivalently in the form

$L_\wp(H^m(X), s)^{-1} = \det_\infty(s - \Theta_q)$, (III.3.11)


for an operator with spectrum


$\mathrm{Spec}(s - \Theta_q) = \Big\{ \frac{2\pi i}{\log q}\Big(\frac{\log q}{2\pi i}(s - \alpha_\lambda) + n\Big) : n \in \mathbb{Z},\ \lambda \in \mathrm{Spec}(Fr^*_\wp) \Big\}$, (III.3.12)


with multiplicities $d_\lambda$ and with $q^{\alpha_\lambda} = \lambda$.
Moreover, the local factor (III.3.8) at infinity can be written similarly in
the form


$L(H^m(X), s)^{-1} = \det_\infty\Big(\frac{1}{2\pi}(s - \Phi)\Big|_{\mathcal{H}^m}\Big)$, (III.3.13)


where $\mathcal{H}^m$ is an infinite dimensional vector space and $\Phi$ is a linear operator
with spectrum $\mathrm{Spec}(\Phi) = \mathbb{Z}$ and finite multiplicities. This operator is regarded
as a “logarithm of Frobenius” at arithmetic infinity.

Given Deninger’s formulae (III.3.12) and (III.3.13), it is natural to ask for
a cohomological interpretation of the data


$(\mathcal{H}^m, \Phi)$, (III.3.14)


somewhat analogous to the non-Archimedean data (III.3.6).


III.4 A guess that the missing geometric objects are 
noncommutative spaces 


We shall see that a cohomological interpretation of the above data $(\mathcal{H}^m, \Phi)$ in
(III.3.14) leads to the notions of a spectral triple and a noncommutative space,
see §8.2.


III.4.1 Types and examples of noncommutative spaces, and how to
work with them. Noncommutative geometry and arithmetic 


Let us follow Chapter 1 of [Mar04], and recall some basic notions of
noncommutative geometry, developed by Connes, cf. [Co94], [Co2000], which extends
the tools of ordinary geometry to treat spaces that are quotients, for which
the usual “ring of functions” (i.e. functions invariant with respect to the
equivalence relation) is too small to capture the information on the “inner
structure” of points in the quotient space. Typically, for such spaces functions on
the quotients are just constants, while a non-trivial ring of functions, which
remembers the structure of the equivalence relation, can be defined using a
noncommutative algebra of coordinates, analogous to the noncommutative
variables of quantum mechanics.

Following A. Connes (cf. [Co94], p.85) let us give the simplest example of a
noncommutative quotient space, and consider the set $Y = \{a, b\}$ consisting of
two elements a and b, so that the algebra $C(Y)$ of complex-valued functions
on Y is the commutative algebra $\mathbb{C} \oplus \mathbb{C}$ of $2 \times 2$ diagonal matrices. There are
two ways of declaring that the two points a and b of Y are identical under an
equivalence relation $a \sim b$:


1) The first method is to consider the subalgebra $A \subset C(Y)$ of functions on
Y which take the same value at a and b: $f(a) = f(b)$.
2) The second method is to consider a larger algebra $B \supset C(Y)$ of all $2 \times 2$
matrices:

$\begin{pmatrix} f_{aa} & f_{ab} \\ f_{ba} & f_{bb} \end{pmatrix}$


(these are functions on the graph of the equivalence relation).

The relation between the two algebras is given by the notion of strong
Morita equivalence of C*-algebras (cf. [Rief76]). This relation, which resembles
the Brauer equivalence (see §4.5.5), preserves many invariants of C*-algebras,
such as K-theory and the topology of the space of irreducible representations.
One can then interpret

$\omega_a = f_{aa}$, $\omega_b = f_{bb}$
as pure states of B in the sense of quantum mechanics, which yield equivalent
irreducible representations of $B = M_2(\mathbb{C})$.

Note that if a and b are not equivalent, one obtains by the second method
only the algebra of diagonal matrices, because the graph of the equivalence
relation consists then just of $(a, a)$ and $(b, b)$.
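
The second method can be sketched in code. The snippet below is a minimal illustration, assuming that functions on the graph of the relation are multiplied by convolution, (f*g)(x,z) = Σ_y f(x,y) g(y,z); the names `convolve`, `full` and `diag` are ours. With a ∼ b the product is matrix multiplication in M₂(ℂ); without the identification only diagonal entries survive and the algebra is commutative.

```python
def convolve(f, g, graph):
    """Convolution on the graph of an equivalence relation:
    (f*g)(x, z) = sum over y with (x, y) and (y, z) in graph of f(x, y) * g(y, z)."""
    points = {x for pair in graph for x in pair}
    h = {}
    for (x, z) in graph:
        h[(x, z)] = sum(
            f[(x, y)] * g[(y, z)]
            for y in points
            if (x, y) in graph and (y, z) in graph
        )
    return h

# Graph of the relation when a ~ b: all four pairs.
full = {("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")}
# Graph when a and b are not identified: only the diagonal.
diag = {("a", "a"), ("b", "b")}

# Two functions on the full graph, i.e. two 2x2 matrices [[1,2],[3,4]] and [[0,1],[1,0]].
f = {("a", "a"): 1, ("a", "b"): 2, ("b", "a"): 3, ("b", "b"): 4}
g = {("a", "a"): 0, ("a", "b"): 1, ("b", "a"): 1, ("b", "b"): 0}
fg = convolve(f, g, full)
gf = convolve(g, f, full)

# Two functions on the diagonal graph, i.e. two diagonal matrices.
fd = {("a", "a"): 1, ("b", "b"): 4}
gd = {("a", "a"): 5, ("b", "b"): 6}
fd_gd = convolve(fd, gd, diag)
gd_fd = convolve(gd, fd, diag)
print(fg, gf, fd_gd)
```

Here `fg` and `gf` differ, which shows that the algebra attached to the identified pair is genuinely noncommutative, while the diagonal case reduces to pointwise multiplication.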

Let us describe another simple example (cf. [Co94], p.87, and [Mar04], §2
of Ch.1), in which the above two algebraic operations of quotient 1) and 2)
yield obviously different (even not strongly Morita equivalent) algebras.

Take the topological space $Y = [0, 1] \times \{0, 1\}$ with the equivalence relation
$R$: $(x, 0) \sim (x, 1)$ for $x \in (0, 1)$, and let $X = Y/R$ be the (non-Hausdorff)
quotient space. That is, $Y = I_1 \cup I_2$ is the disjoint union of two copies $I_1$ and
$I_2$ of the interval $[0, 1]$, and the quotient space $X = Y/R$ is obtained by gluing
the two interiors of the intervals $I_1$ and $I_2$ but not the end points.

By the first method, take continuous functions f on Y invariant with respect
to the equivalence relation; this is the algebra $A = C([0, 1])$, which is
homotopic to $\mathbb{C}$, and $K_0(A) = \mathbb{Z}$. By the second method, consider functions
on the graph of the equivalence relation; then one obtains the algebra


$B = \{ f \in C([0, 1]) \otimes M_2(\mathbb{C}) \mid f(0) \text{ and } f(1) \text{ are diagonal} \}$,


which is an interesting nontrivial algebra (we view a generic element
$x \in M_2(C([0, 1]))$ as a continuous map $t \mapsto x(t) \in M_2(\mathbb{C})$). Note that the space of
irreducible representations of B is $X = Y/R$, and the K-theory of B is much
less trivial than that of $A = C([0, 1])$.

In general, such “quantum spaces” are defined by extending the
Gelfand-Naimark correspondence

X locally compact Hausdorff space $\Longleftrightarrow$ $C_0(X)$ Abelian C*-algebra


by dropping the commutativity hypothesis on the right hand side. The
correspondence then becomes a definition of what is on the left hand side: a
noncommutative space.

The idea of preserving the information on the structure of the equivalence
relation in the description of quotient spaces has analogues in Grothendieck’s
theory of stacks in algebraic geometry. Such quotients arise from foliations,
and, more recently, in number theory and in arithmetic geometry, starting
from [BoCo95], where a noncommutative space related to class field theory is
constructed. This space, viewed as the space of 1-dimensional $\mathbb{Q}$-lattices up
to commensurability, relates the phenomena of spontaneous symmetry breaking
in quantum statistical mechanics to the mathematics of Galois theory. A
similar noncommutative space was used in [Co2000a] and [Co99] to obtain a
spectral realization of the zeroes of the Riemann zeta function. Some other
recent examples (cf. [CoMo04], [CoMo04a]) interpret differential operators
on modular forms (Rankin-Cohen brackets) in terms of the Hopf algebra of
a noncommutative space of codimension one foliations. It turns out that the
modular Hecke algebra appears as the “holomorphic part” of the algebra of a
certain noncommutative space (cf. [CoMar04]).

Let us only mention the noncommutative elliptic curves and noncommutative
modular curves, cf. op.cit.

We add to this list other interesting examples coming from the construction
in [Man91], giving a description of the totally degenerate fibers at “arithmetic
infinity” of arithmetic varieties over number fields, cf. [CM]. We describe in
Chapter 8 how Connes’ theory links this construction with Deninger’s
approach [Den91], which suggested reinterpreting Serre’s gamma-factors of
zeta-functions as infinite regularized determinants of certain ∞-adic Frobenius
maps acting upon new cohomological spaces. These spaces are described in
§8.2.


Isomorphism of noncommutative spaces and Morita equivalence 


In noncommutative geometry, isomorphisms of C*-algebras are too restrictive
to provide a good notion of isomorphism of noncommutative spaces, and the
correct notion is provided by Morita equivalence of C*-algebras (cf. [Man02a],
§1.3):


Definition III-0.1 (Morita category). Let A, B be two associative rings. A
Morita morphism $A \to B$ is, by definition, the isomorphism class of a bimodule
${}_A M_B$, which is projective and finitely generated separately as a module over A
and over B.

The composition of morphisms is given by the tensor product ${}_A M_B \otimes {}_B M'_C$,
or ${}_A M \otimes_B M'_C$ for short.


If we associate to ${}_A M_B$ the functor

$\mathrm{Mod}_A \to \mathrm{Mod}_B : N_A \mapsto N \otimes_A M_B$,


the composition of functors will be given by the tensor product, and
isomorphisms of functors will correspond to the isomorphisms of bimodules.

We imagine an object A of the (opposite) Morita category as a
noncommutative space, right A-modules as sheaves on this space, and the tensor
multiplication by ${}_A M_B$ as the pull-back functor. We have chosen to work
with right modules, but passing to the opposite rings allows one to reverse
left and right in all our statements.

Two bimodules ${}_A M_B$ and ${}_B N_A$ supplied with two bimodule isomorphisms
${}_A M \otimes_B N_A \to {}_A A_A$ and ${}_B N \otimes_A M_B \to {}_B B_B$ define mutually inverse Morita
isomorphisms (equivalences) between A and B. The basic example of this kind
is furnished by $B = \mathrm{Mat}(n, A)$, $M = {}_A A^n_B$ and $N = {}_B A^n_A$.


We will now briefly summarize Morita’s theory.


(A) Characterization of functors $S : \mathrm{Mod}_A \to \mathrm{Mod}_B$ of the form
$N_A \mapsto N \otimes_A M_B$. They are precisely the functors satisfying either of the two
equivalent conditions:

(i) S is right exact and preserves direct sums.

(ii) S admits a right adjoint functor $T : \mathrm{Mod}_B \to \mathrm{Mod}_A$ (which is then
naturally isomorphic to $\mathrm{Hom}_B(M_B, *)$).
We will call such functors continuous.

(B) Characterization of continuous functors S such that T is also continuous
and $ST \cong 1$. Let S be given by ${}_A M_B$ and T by ${}_B N_A$. Then
$M \otimes_B N \cong {}_A A_A$. Moreover, in this case
(iii) $M_B$ and ${}_B N$ are projective.

(iv) ${}_A M$ and $N_A$ are generators.

In particular, equivalences $\mathrm{Mod}_A \to \mathrm{Mod}_B$ are automatically continuous.
Hence any pair of mutually quasi-inverse equivalences must be given by
a couple of biprojective bigenerators as above.

(C) Finite generation and balance. Any right module $M_B$ can be considered
as a bimodule ${}_A M_B$ where $A = B' := \mathrm{End}_B(M_B)$. We can then similarly
produce the ring $B'' = A' := \mathrm{End}_A({}_A M)$. The module $M_B$ is called balanced if
$B'' = B$. Similarly, one can start with a left module. With this notation,
we have:


(v) $M_B$ is a generator iff ${}_{B'} M$ is balanced and finitely generated projective.
Properties (i)-(v) can serve as a motivation for our definition of the Morita
category above.


The tools of noncommutative geometry


The following is a list of some techniques used in order to compute invariants
and extract essential information from the geometry.

— Topological invariants: K-theory
— Hochschild and cyclic cohomology
— Homotopy quotients, assembly map (Baum-Connes)
— Metric structure: Dirac operator, spectral triples
— Characteristic classes, zeta functions


We recall some of these notions in Part III, starting with the Dirac operator
and spectral triples.

In noncommutative geometry, the notion of a spectral triple provides the 
correct generalization of the classical structure of a Riemannian manifold. The 
two notions agree on a commutative space. In the usual context of Riemannian 
geometry, the definition of the infinitesimal element ds on a smooth spin man- 
ifold can be expressed in terms of the inverse of the classical Dirac operator 
D. This is the key remark that motivates the theory of spectral triples. In 
particular, the geodesic distance between two points on the manifold is
defined in terms of $D^{-1}$ (cf. [Co94] §VI). The spectral triple that describes a
classical Riemannian spin manifold is (A, H, D), where A is the algebra of 
complex valued smooth functions on the manifold, H is the Hilbert space 
of square integrable spinor sections, and D is the classical Dirac operator (a 
square root of the Laplacian). These data determine completely and uniquely 
the Riemannian geometry on the manifold. 

The notion of spectral triple extends to more general noncommutative 
spaces, where the data (A,H,D) consist of a C*-algebra A (or more generally 
of a smooth subalgebra of a C*-algebra) with a representation as bounded 
operators on a Hilbert space H, and an operator D on H that verifies the
main properties of a Dirac operator.


III.4.2 Generalities on spectral triples 


Let us recall the basic setting of Connes’ theory of spectral triples. For a more
complete treatment we refer to [Co95], [Co94], [CoMo95].


Definition III-0.2. A spectral triple (A, H, D) consists of an involutive
algebra A with a representation


$\rho : A \to B(H)$
as bounded operators on a Hilbert space H, and an operator D (called the
Dirac operator) on H, which satisfies the following properties:


1. D is self-adjoint.
2. For all $\lambda \notin \mathbb{R}$, the resolvent $(D - \lambda)^{-1}$ is a compact operator on H.
3. For all $a \in A$, the commutator $[D, a]$ is a bounded operator on H.

Remark III-0.3. Property 2 of Definition III-0.2 generalizes ellipticity of
the standard Dirac operator on a compact manifold. Usually, the involutive
algebra A satisfying property 3 can be chosen to be a dense subalgebra of a
C*-algebra. This is the case, for instance, when we consider smooth functions
on a manifold as a subalgebra of the commutative C*-algebra of continuous
functions. In the classical case of Riemannian manifolds, property 3 is
equivalent to the Lipschitz condition, hence it is satisfied by a larger class than that
of smooth functions.
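
The three properties can be checked by hand in the simplest commutative example, the circle, which is not worked out in the text. The sketch below assumes the Fourier basis e_n = exp(inθ) of L²(S¹), on which D = −i d/dθ is diagonal with eigenvalue n, and truncates to the modes −N..N (N is a hypothetical cut-off); multiplication by exp(iθ) becomes a shift matrix.

```python
N = 4  # hypothetical cut-off: Fourier modes -N..N
modes = list(range(-N, N + 1))
dim = len(modes)

# D = -i d/dtheta is diagonal in the basis e_n, with real eigenvalue n (self-adjoint).
D = [[modes[i] if i == j else 0 for j in range(dim)] for i in range(dim)]

# Multiplication by exp(i*theta) sends e_n to e_{n+1}: a shift (truncated at the top mode).
S = [[1 if i == j + 1 else 0 for j in range(dim)] for i in range(dim)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def resolvent_eigenvalue(n, lam):
    """Eigenvalue 1/(n - lam) of (D - lam)^(-1) on e_n, for non-real lam."""
    return 1 / (n - lam)

comm = commutator(D, S)  # equals S itself: [D, exp(i*theta)] = exp(i*theta)
small = abs(resolvent_eigenvalue(10, 1j))
large = abs(resolvent_eigenvalue(1, 1j))
print(comm == S, small, large)
```

The commutator equals the shift itself, so it is bounded with norm independent of the cut-off, while the resolvent eigenvalues 1/(n − λ) tend to zero as |n| grows, which is the finite-dimensional shadow of compactness.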


III.4.3 Contents of Part III: description of parts of this program


We included under the heading of this part various recent topics related to 
Arakelov’s geometry, and Noncommutative geometry. All these topics use a 
lot of algebraic tools such as cohomology groups and non-commutative rings. 


We start Chapter 8 with §8.1 on Schottky uniformization and Arakelov
geometry, based on geometric constructions in [Man91]. We give an
analytic construction of degenerating curves over complete local fields, following
[Mum72], in §8.1.2. Then in §8.1.5 we describe the result in [Man91] on the
relation between the Arakelov Green function on a Riemann surface $X(\mathbb{C})$
with Schottky uniformization and geodesics in the 3-dimensional hyperbolic
handlebody $\mathfrak{X}_\Gamma$.

Following [CM], we describe in this chapter how Connes’ theory links
these constructions with Deninger’s approach [Den91], which suggested
reinterpreting Serre’s gamma-factors of zeta-functions as infinite regularized
determinants of certain ∞-adic Frobenius maps acting upon two types of
cohomology spaces. Archimedean cohomology spaces are described in §8.2,
and dynamical cohomology spaces are described in §8.3.

In §8.2 we describe a cohomological theory for the Archimedean fiber of
an Arakelov surface. These results are based on the general theory, valid for
any arithmetic variety, developed in [Cons98] (we follow an interesting
discussion of the Archimedean cohomology in [CM04b] and [Mar04], Chapter 3).
This construction provides a refinement of the Archimedean cohomology $H^{\cdot}_{ar}$
introduced by Deninger in [Den91].

Based on this construction and following [CM], we describe in §8.2.2
cohomological spectral data $(A, H^{\cdot}(X^*), \Phi)$, where the algebra A is obtained
from the $SL(2, \mathbb{R})$ action on certain cohomology groups.

In Theorem 8.3 we describe how to recover the alternating product of
the Archimedean factors from a zeta function of a spectral triple. In §8.3 a
different construction is described, which is related to the description in [Man91]
of the dual graph of the fiber at infinity. A geometric model is given here for
the dual graph as the mapping torus of a dynamical system T on a Cantor
set.


We consider a noncommutative space which describes the action of the
Schottky group on its limit set and parameterizes the “components of the
closed fiber at infinity”. This space is represented by a Cuntz-Krieger algebra
$\mathcal{O}_A$, described in §8.3.5.

Next, we describe a spectral triple for this noncommutative space, via a
representation on the cochains of a “dynamical cohomology”, defined in terms
of the tangle of bounded geodesics in the handlebody. In both constructions
presented in Chapter 8, the Dirac operator agrees with the grading operator
$\Phi$, that represents the “logarithm of a Frobenius-type operator” on the
Archimedean cohomology. In fact, the Archimedean cohomology embeds in
the dynamical cohomology, compatibly with the action of a real Frobenius
$F_\infty$, so that the local factor can again be recovered from these data.
Moreover, the “reduction mod infinity” is presented in §8.4 in terms of the
homotopy quotient associated to the noncommutative space $\mathcal{O}_A$ and the $\mu$-map of
Baum-Connes, cf. [BaCo].


Suggestions for further reading to Part III 


Works of A. Connes on the trace formula in noncommutative geometry and the
zeros of the Riemann zeta function, [Co99], [Co2000a], work of J.-B. Bost [Bo01]
on algebraic leaves of algebraic foliations over number fields, and [BoCo95],
[CoMar04], [CoMo95], [CoMo04], [Ber86], shedding new light on relations of
physics with number theory using noncommutative geometry (see also the nice
papers of P. Cartier [Car95], [Car01], [Car02] on related subjects).

We refer also to the Number Theory and Physics archive at
http://www.maths.ex.ac.uk/~mwatkins/zeta/physics.htm
where one can find very useful material and bibliography on relations with
quantum mechanics, statistical mechanics, p-adic and adelic physics, Selberg
trace formula, string theory and quantum cosmology, scattering theory,
dynamical and spectral zeta functions, trace formulae and explicit formulae, 1/f
noise and signal processing, supersymmetry, QCD, renormalisation,
symmetry breaking and phase transitions, quantum fields, integer partitions, time,
biologically-inspired and similarly unconventional methods for finding primes,
dynamical systems, entropy, specific zeta values, logic, languages, information,
etc., probability and statistics, noncommutative geometry, random matrices,
Fourier theory, fractal geometry, Bernoulli numbers, Farey sequences,
Beurling g-primes, golden mean, zeta functions and L-functions.


Arakelov Geometry and Noncommutative
Geometry (d’après C. Consani and M. Marcolli,
[CM])


8.1 Schottky Uniformization and Arakelov Geometry 


8.1.1 Motivations and the context of the work of Consani-Marcolli 


Our primary motivation in this chapter is a desire to enrich the somewhat
formal picture of Arakelov’s geometry at arithmetical infinity (see [Ara74b],
[La88], [GS92], [Man84], and §5.2.6). We try to describe geometric and
algebraic objects which play the roles, respectively, of the “∞-adic completion” of
a completed arithmetical Arakelov variety, of its “closed fiber at infinity”, and
the “reduction modulo ∞”, in the spirit of the work of Mumford [Mum72]. We
follow the paper [Man91], and recent works [CM], [CM03], [CM04a], [Cons98],
which clarified much about arithmetical infinity.


We then try to explain how to use Connes’ theory of spectral triples (see
[Co95], [Co99], [Co94], and §8.3) in order to relate the hyperbolic geometry
to Deninger’s Archimedean cohomology.

We recall the beginning of a dictionary which translates classical notions
into the language of operators in the Hilbert space $\mathcal{H}$:


Complex variable     Operator in $\mathcal{H}$
Real variable        Self-adjoint operator        (8.1.1)
Infinitesimal        Compact operator

From the arithmetic point of view, algebraic numbers appear in commutative 
geometry as values of algebraic functions, whereas in noncommutative 
geometry they appear as values of traces of projections, or more generally 
values of appropriate states on observables. In both cases, a control of the 
action of the Galois group is gained, if this action commutes with an action of 
certain “geometric” endomorphisms, or correspondences, whenever the latter 
are defined over the ground field. 

Interesting examples of noncommutative spaces come from geometric 
constructions in [Man91]. 


The choice of structures used in this section is related to Mumford’s idea 
(see [Mum72], and §8.1.2) that p-adic curves with maximally degenerate 
reduction admit an (essentially unique) p-adic Schottky uniformization. One can 
use his geometric picture in an Archimedean setting. 


In the paper [Man91] a filling $\mathfrak{X}$ of $X$ was constructed: an 
auxiliary three-dimensional manifold endowed with a metric of constant negative 
curvature, such that $X$ is its boundary, and which can be defined by a Schottky 
uniformization, described in §8.1.3. 

It was also proven in [Man91] that Green’s function on $X$ can be expressed 
in terms of the geometry of geodesics in $\mathfrak{X}$. 

An interesting analogy was suggested, in which the set of all closed geodesics 
in $\mathfrak{X}$ and the set of geodesics with one end at $X$ play the roles, 
respectively, of the “∞-adic completion” of $X$, its “closed fiber at infinity”, 
and the “reduction modulo ∞”, in the spirit of the work of Mumford ([Mum72]). 


Preliminary notions and notation 


Throughout this section let us denote by $K$ one of the following fields: 
(a) the complex numbers $\mathbb{C}$, (b) a finite extension of $\mathbb{Q}_p$. 
In case (b), we write $\mathcal{O}_K$ for the ring of integers of $K$, 
$\mathfrak{m} \subset \mathcal{O}_K$ for the maximal ideal and 
$\pi \in \mathfrak{m}$ for a uniformizer (i.e. $\mathfrak{m} = (\pi)$). We also 
denote by $k$ the residue class field $k = \mathcal{O}_K/\mathfrak{m}$. 


We denote by $\mathbb{H}'$ (or by $\mathbb{H}^3 = \mathbb{H}'$) the 
three-dimensional real hyperbolic space, i.e. the quotient 

$$\mathbb{H}' = SU(2)\backslash PGL(2,\mathbb{C}).$$ 

This space can also be described as the upper half space 
$\mathbb{H}' \simeq \mathbb{C} \times \mathbb{R}^+$ endowed with the hyperbolic 
metric. The group $PSL(2,\mathbb{C})$ acts on $\mathbb{H}'$ by isometries. 
The complex projective line $\mathbb{P}^1(\mathbb{C})$ is identified with the 
conformal boundary at infinity of $\mathbb{H}'$, and the action of 
$PSL(2,\mathbb{C})$ on $\mathbb{H}'$ extends to an action on 
$\overline{\mathbb{H}'} := \mathbb{H}' \cup \mathbb{P}^1(\mathbb{C})$. The group 
$PSL(2,\mathbb{C})$ acts on $\mathbb{P}^1(\mathbb{C})$ by fractional linear 
transformations. 


8.1.2 Analytic construction of degenerating curves over complete 
local fields and Arakelov geometry (d’après Mumford [Mum72]) 


The idea of investigating the p-adic analogues of classical uniformizations of 
curves and abelian varieties is due to John Tate [Ta74], who showed that if $K$ 
is a complete non-Archimedean local field, and $E$ is an elliptic curve over $K$ 
whose j-invariant is not an integer, then $E$ can be analytically uniformized. 
This uniformization is not a holomorphic map 

$$\pi : \mathbb{A}^1_K \to E$$ 

generalizing the universal covering space 

$$\pi : \mathbb{C} \to E \quad (= \text{closed points of an elliptic curve over } \mathbb{C}),$$ 

but instead is a holomorphic map 

$$\pi_2 : \mathbb{A}^1_K \setminus \{0\} \to E$$ 

generalizing an infinite cyclic covering $\pi_2$ over $\mathbb{C}$: 

$$\mathbb{C} \xrightarrow{\ \pi_1\ } \mathbb{C}^* \xrightarrow{\ \pi_2\ } E, \qquad \pi_1(z) = e^{2\pi i z/\omega},$$ 

where $\omega$ is one of the two periods of $E$. Here one can take holomorphic 
map to mean holomorphic in the sense of the non-Archimedean function theory of 
Grauert and Remmert [GR71]. But the uniformization $\pi_2$ is more simply 
expressed by embedding $E$ in $\mathbb{P}^2$, and defining the three homogeneous 
coordinates of $\pi_2(z)$ by three everywhere convergent Laurent series. 

In order to explain what happens for curves of higher genus, let us present 
a bit further the interesting analogies between the real, complex and p-adic 
structures of $PGL(2)$ (as developed by Bruhat, Tits and Serre, see [Se71]): 


(A) real case: $PSL(2,\mathbb{R})$ acts isometrically and transitively on the 
upper half plane $H$, and the boundary can be identified with $\mathbb{R}P^1$ 
(the real line plus $\infty$). 


Fig. 8.1. [Figure labels: action $z \mapsto \frac{az+b}{cz+d}$; metric $ds^2 = \frac{1}{y^2}(dx^2 + dy^2)$] 


Fig. 8.2. [Figure labels: coordinates $z \in \mathbb{C}$, $x \in \mathbb{R}$, $x \ge 0$; metric $ds^2 = \frac{1}{x^2}(|dz|^2 + dx^2)$] 


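(B) complex case: $PGL(2,\mathbb{C})$ acts isometrically and transitively on the 
upper half space $\mathbb{H}'$, and the boundary can be identified with 
$\mathbb{C}P^1$. The action of $PSL(2,\mathbb{C})$ on $\mathbb{H}'$ is given by 

$$(z, x) \mapsto \left( \frac{(az+b)\overline{(cz+d)} + a\bar{c}\,x^2}{|cz+d|^2 + |c|^2x^2},\ \frac{x}{|cz+d|^2 + |c|^2x^2} \right).$$ 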


(C) p-adic case: $PGL(2,K)$ acts isometrically and transitively on the 
Bruhat-Tits tree $\Delta$ (whose vertices correspond to the subgroups 
$g\,PGL(2,\mathcal{O}_K)\,g^{-1}$, and whose edges have length 1 and correspond 
to the subgroups $gBg^{-1}$, with 
$B = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a,b,c,d \in \mathcal{O}_K,\ c \in \mathfrak{m},\ ad \notin \mathfrak{m} \right\}$ 
modulo $\mathcal{O}_K^*$), and the set of whose ends can be identified with 
$KP^1$ (see [Se77], and [Mum72], p. 131). 


[the case card (k) = 3]. 


Fig. 8.3. 


In $\Delta$, for any vertex $v$ the set of edges meeting $v$ is naturally 
isomorphic to $kP^1$, the isomorphism being canonical up to an element of 
$PGL(2,k)$ (where $k = \mathcal{O}_K/\mathfrak{m}$). 

In the first case, if $\Gamma \subset PSL(2,\mathbb{R})$ is a discrete subgroup 
with no elements of finite order such that $PSL(2,\mathbb{R})/\Gamma$ is 
compact, we obtain Koebe’s uniformization 

$$H \to H/\Gamma = X$$ 

of an arbitrary compact Riemann surface $X$ of genus $g \ge 2$. 

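In the second, if $\Gamma \subset PSL(2,\mathbb{C})$ is a discrete subgroup 
which acts discontinuously at at least one point of $\mathbb{C}P^1$ (a Kleinian 
group) and which moreover is free with $n$ generators and has no unipotent 
elements in it, then according to a theorem of Maskit (see [Mum72]), $\Gamma$ 
is a so-called Schottky group, i.e. if we denote by $\Omega_\Gamma$ the domain 
of discontinuity of $\Gamma$ acting on $\mathbb{P}^1(\mathbb{C})$, then 
$\Omega_\Gamma$ is connected and up to a homeomorphism we get a uniformization 

$$\begin{array}{ccccc} 
(\mathbb{H}' \cup \Omega_\Gamma) & \to & (\mathbb{H}' \cup \Omega_\Gamma)/\Gamma & \underset{\text{homeo}}{\cong} & \text{solid torus with } n \text{ handles} \\ 
\cup & & \cup & & \cup \\ 
\Omega_\Gamma & \to & \Omega_\Gamma/\Gamma & \underset{\text{homeo}}{\cong} & \{\text{boundary, a surface of genus } n\} 
\end{array}$$ 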


The quotient 

$$X_{/\mathbb{C}} = \Omega_\Gamma/\Gamma \tag{8.1.2}$$ 

is a Riemann surface of genus $g$ and the covering 
$\Omega_\Gamma \to X_{/\mathbb{C}}$ is called a Schottky uniformization of 
$X_{/\mathbb{C}}$. Every complex Riemann surface $X_{/\mathbb{C}}$ admits 
a Schottky uniformization. In particular, $\Omega_\Gamma/\Gamma$ is a compact 
Riemann surface of genus $n$, and $\Omega_\Gamma$ is the covering corresponding 
to the subgroup 

$$N \subset \pi_1(\Omega_\Gamma/\Gamma), \qquad N = \text{least normal subgroup containing } a_1, \dots, a_n.$$ 

The Schottky uniformization $\pi$ admits a p-adic analog. 

In the third case, let $\Gamma \subset PGL(2,K)$ be any discrete subgroup 
consisting entirely of hyperbolic elements. Then according to Ihara, the group 
$\Gamma$ is free: let $\Gamma$ have $n$ generators. Again, let $\Omega_\Gamma$ 
be the set of closed points of $\mathbb{P}^1_K$ where $\Gamma$ acts 
discontinuously (equivalently, $\Omega_\Gamma$ is the set of points which are 
not limits of fixed points of elements of $\Gamma$). 

Then it was proved in [Mum72] that there is a curve $C$ of genus $n$ and a 
holomorphic isomorphism: 

$$\pi : \Omega_\Gamma/\Gamma \xrightarrow{\ \sim\ } C.$$ 

Moreover, $\Delta/\Gamma$ has a very nice interpretation as a graph of the 
specialization of $C$ over the ring $\mathcal{O}_K$. In fact 


a) there will be a smallest subgraph 
$(\Delta/\Gamma)_0 \subset \Delta/\Gamma$ 
such that 
$\pi_1((\Delta/\Gamma)_0) \xrightarrow{\sim} \pi_1(\Delta/\Gamma)$, and $(\Delta/\Gamma)_0$ will be finite. 

Fig. 8.4. [Figure labels: ends of $\Delta/\Gamma$; $(\Delta/\Gamma)_0$] 


b) $C$ will have a canonical specialization $\bar{C}$ over $\mathcal{O}_K$, 
where $\bar{C}$ is a singular curve of arithmetic genus $n$ made up from copies 
of $\mathbb{P}^1_k$, with a finite number of distinct pairs of $k$-rational 
points identified to form ordinary double points. Such a curve $\bar{C}$ will be 
called a $k$-split degenerate curve of genus $n$. 

c) $C(K)$, the set of $K$-rational points of $C$, will be naturally isomorphic 
to the set of ends of $\Delta/\Gamma$; $\bar{C}(k)$, the set of $k$-rational 
points of $\bar{C}$, will be naturally isomorphic to the set of edges of 
$\Delta/\Gamma$ that meet vertices of $(\Delta/\Gamma)_0$ (so that the 
components of $\bar{C}$ correspond to the edges of $\Delta/\Gamma$ meeting a 
fixed vertex of $(\Delta/\Gamma)_0$ and the double points of $\bar{C}$ 
correspond to the edges of $(\Delta/\Gamma)_0$); and finally the specialization 
map 

$$C(K) \to \bar{C}(k)$$ 

is equal, under the above identification, to the map 

$$\{\text{ends of } \Delta/\Gamma\} \to \{\text{edges of } \Delta/\Gamma \text{ meeting } (\Delta/\Gamma)_0\}$$ 

which takes an end to the last edge in the shortest path from that end to 
$(\Delta/\Gamma)_0$. 


Example 8.1. Figure 8.5 illustrates a case when the genus is 2 and $\bar{C}$ 
has 2 components, each with one double point and meeting each other once. 
Because all the curves $C$ which were constructed in this way have property 
(b), they are referred to as degenerating curves. The main theorem in 
[Mum72] implies that every such degenerating curve $C$ has a unique analytic 
uniformization $\pi : \Omega_\Gamma \to C$. 


8.1.3 Schottky groups and new perspectives in Arakelov geometry 


Fig. 8.5. [Figure labels: $\bar{C}$; $(\Delta/\Gamma)_0$] 


A unified description of the Archimedean and the totally split degenerate 
fibers of an arithmetic surface was given in [CM], using operator algebras 
and Connes’ theory of spectral triples in noncommutative geometry, [Co94]. 
Some more recent results on a noncommutative interpretation of the totally 
split degenerate fibers of an arithmetic surface were reported in [CM04a]. 


Let $X$ be an arithmetic surface defined over $\mathrm{Spec}(\mathbb{Z})$ (or 
over $\mathrm{Spec}(\mathcal{O}_K)$, for a number field $K$), having the smooth 
algebraic curve $X_{/\mathbb{Q}}$ as its generic fiber. Then, as a Riemann 
surface, $X_{/\mathbb{C}}$ always admits a uniformization by means of a Schottky 
group $\Gamma$. In analogy to Mumford’s p-adic uniformization of algebraic 
curves (cf. [Mum72]), the Riemann surface $X_{/\mathbb{C}}$ can be interpreted 
as the boundary at infinity of a 3-manifold $\mathfrak{X}_\Gamma$ defined as the 
quotient of the real hyperbolic 3-space $\mathbb{H}^3$ by the action of the 
Schottky group $\Gamma$. The space $\mathfrak{X}_\Gamma$ contains in its 
interior an infinite link of bounded geodesics. 

In [Man91] an expression was given for the Arakelov Green function on 
$X_{/\mathbb{C}}$ in terms of configurations of geodesics in 
$\mathfrak{X}_\Gamma$, thus interpreting this tangle as the dual graph 
$\mathcal{G}$ of the “closed fiber at infinity” of $X$. 


Schottky uniformization and Schottky groups 


Topologically a compact Riemann surface $X$ of genus $g$ is obtained by gluing 
the sides of a $4g$-gon. Correspondingly, the fundamental group has a 
presentation 

$$\pi_1(X) = \langle a_1, \dots, a_g, b_1, \dots, b_g \mid \textstyle\prod_i [a_i, b_i] = 1 \rangle,$$ 

where the generators $a_i$ and $b_i$ label the sides of the polygon. 

In the genus $g = 1$ case, the parallelogram is the fundamental domain of 
the $\pi_1(X) \simeq \mathbb{Z}^2$ action on the plane $\mathbb{C}$, so that 
$X = \mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$ is an elliptic curve. 

For genus $g \ge 2$ the hyperbolic plane $H$ admits a tessellation 
by regular $4g$-gons, and the action of the fundamental group by deck 
transformations is realized by the action of a subgroup 
$\pi_1(X) \simeq G \subset PSL(2,\mathbb{R})$ by isometries of $H$. This endows 
the compact Riemann surface $X$ with a hyperbolic metric and a Fuchsian 
uniformization 

$$X = G\backslash H.$$ 

Another, less well known, type of uniformization of compact Riemann surfaces is 
Schottky uniformization. 

Let us recall briefly some general facts on Schottky groups. 

A Schottky group of rank $g \ge 1$ is a discrete subgroup 
$\Gamma \subset PSL(2,\mathbb{C})$ which is purely loxodromic and isomorphic to 
a free group of rank $g$. The group $PSL(2,\mathbb{C})$ acts on 
$\mathbb{P}^1(\mathbb{C})$ by fractional linear transformations 

$$\gamma : z \mapsto \frac{az+b}{cz+d}.$$ 

Thus, $\Gamma$ also acts on $\mathbb{P}^1(\mathbb{C})$. 

Let us denote by $\Lambda_\Gamma$ the limit set of the action of $\Gamma$. One 
sees that $\Lambda_\Gamma$ is contained in $\mathbb{P}^1(\mathbb{C})$. This set 
can also be described as the closure of the set of the attractive and repelling 
fixed points $z^\pm(g)$ of the loxodromic elements $g \in \Gamma$. In the case 
$g = 1$ the limit set consists of two points, but for $g \ge 2$ the limit set is 
usually a fractal of some Hausdorff dimension 
$0 < \delta = \dim_H(\Lambda_\Gamma) < 2$ (cf. e.g. Fig. 8.6 reproduced here 
from [MSW02] and 
http://klein.math.okstate.edu/IndrasPearls/gallery/GeneralSchottky.gif 
with the kind permission of David J. Wright and Cambridge UP). 
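
As a small numerical illustration of the attractive and repelling fixed points $z^\pm$ of a loxodromic element, the following sketch solves the fixed point equation $cz^2 + (d-a)z - b = 0$ and uses the derivative $1/(cz+d)^2$ to decide which root is attractive. The function name `fixed_points` and the sample matrix are our own choices for illustration; the code assumes $ad - bc = 1$, $c \neq 0$ and a loxodromic (or hyperbolic) element.

```python
import cmath


def fixed_points(a, b, c, d):
    """Return (attractive, repelling) fixed points of z -> (a z + b)/(c z + d).

    Assumptions (not checked): a d - b c = 1, c != 0, and the map is
    loxodromic, so that the two fixed points are distinct and finite.
    """
    # Fixed points are the roots of c z^2 + (d - a) z - b = 0.
    disc = cmath.sqrt((d - a) ** 2 + 4 * b * c)
    roots = [((a - d) + disc) / (2 * c), ((a - d) - disc) / (2 * c)]
    # For a d - b c = 1 the derivative of the map at z is 1/(c z + d)^2;
    # the attractive fixed point is the one where its modulus is < 1.
    roots.sort(key=lambda z: abs(1 / (c * z + d) ** 2))
    return roots[0], roots[1]


# Hypothetical sample element: the hyperbolic matrix ((2, 1), (1, 1)).
z_plus, z_minus = fixed_points(2, 1, 1, 1)
```

For this sample matrix the fixed points are the roots of $z^2 - z - 1 = 0$, so the attractive one is the golden mean $(1+\sqrt{5})/2$.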


Fig. 8.6. Limit set of a Schottky group 


Recall that the notion of the Hausdorff dimension can be used for quite 
general subsets $A \subset \mathbb{R}^n$, and it gives for curves and surfaces 
the usual notion of dimension. The Hausdorff dimension (cf. e.g. [Arne04]*)) 
can be non-integer for fractal subsets. 
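
A standard example of a non-integer dimension is the triadic Cantor set, of Hausdorff dimension $\log 2/\log 3$. The sketch below is only a heuristic check, not a proof: it builds the natural covering of the Cantor set by $2^n$ intervals of length $3^{-n}$ and computes the exponent $s$ for which (number of intervals) $\times$ (length)$^s = 1$. The function names are ours.

```python
import math


def cantor_intervals(level):
    """Return the 2**level closed intervals of the level-th construction step."""
    intervals = [(0.0, 1.0)]
    for _ in range(level):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3.0
            refined.append((left, left + third))    # keep the left third
            refined.append((right - third, right))  # keep the right third
        intervals = refined
    return intervals


def covering_exponent(level):
    """Exponent s such that N * eps**s = 1 for the natural covering."""
    count = len(cantor_intervals(level))
    eps = 3.0 ** (-level)  # common length of the intervals at this level
    return math.log(count) / math.log(1.0 / eps)


estimate = covering_exponent(8)
```

The exponent obtained does not depend on the level and agrees with $\log 2/\log 3 \approx 0.6309$.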


We denote by $\Omega_\Gamma = \Omega_\Gamma(\mathbb{C})$ the domain of 
discontinuity of $\Gamma$, that is, the complement of $\Lambda_\Gamma$ in 
$\mathbb{P}^1(\mathbb{C})$. 
The quotient space $\mathfrak{X}_\Gamma := \mathbb{H}^3/\Gamma$ is topologically 
a handlebody of genus $g$, and the quotient 

$$X_{/\mathbb{C}} = \Omega_\Gamma/\Gamma$$ 

is a Riemann surface of genus $g$. The covering 
$\Omega_\Gamma \to X_{/\mathbb{C}}$ is called a Schottky uniformization of 
$X_{/\mathbb{C}}$. Every complex Riemann surface $X_{/\mathbb{C}}$ admits a 
Schottky uniformization. The handlebody $\mathfrak{X}_\Gamma$ can be 
compactified by adding the conformal boundary at infinity $X_{/\mathbb{C}}$ to 
obtain 
$\overline{\mathfrak{X}}_\Gamma := \mathfrak{X}_\Gamma \cup X_{/\mathbb{C}} = (\mathbb{H}^3 \cup \Omega_\Gamma)/\Gamma$. 

Let $\{\gamma_i\}_{i=1}^g$ be a set of generators of the Schottky group 
$\Gamma$. Let us use the notation $\gamma_{i+g} := \gamma_i^{-1}$, for 
$i = 1, \dots, g$. There are $2g$ Jordan curves $C_k$ on the sphere 
$\mathbb{P}^1(\mathbb{C})$, with pairwise disjoint interiors $D_k$, such that 
the elements $\gamma_k$ are given by fractional linear transformations that map 
the interior of $C_k$ to the exterior of $C_j$ with $|k - j| = g$. The curves 
$C_k$ give a marking of the Schottky group. The markings are circles in the case 
of classical Schottky groups. A fundamental domain for the action of a 
classical Schottky group $\Gamma$ on $\mathbb{P}^1(\mathbb{C})$ is the region 
exterior to $2g$ circles (cf. Fig. 8.7 reproduced here from the lectures 
[Mar04]). 


Fig. 8.7. Schottky uniformization for g = 2 


*) A family $R = \{B_i\}_{i \in \mathbb{N}}$ of subsets 
$B_i \subset \mathbb{R}^n$ is an $\varepsilon$-covering of $A$ if 
$A \subset \bigcup_i B_i$ and $\mathrm{diam}(B_i) < \varepsilon$ for all $i$. 
First define 
$M_\varepsilon^\delta(A) = \inf_R \sum_i (\mathrm{diam}\, B_i)^\delta$, where the 
infimum is taken over all $\varepsilon$-coverings. The Hausdorff dimension is 
then defined by 
$\dim_H(A) = \sup\{\delta : \lim_{\varepsilon \to 0} M_\varepsilon^\delta(A) = +\infty\}$. 
One verifies easily that the dimension of a regular curve is 1, and the 
dimension of a regular surface is 2. An example of a non-integer dimension is 
given by the triadic Cantor set consisting of real numbers of the form 
$\sum_{n=1}^{\infty} \varepsilon_n/3^n$ with $\varepsilon_n \in \{0, 2\}$, whose 
Hausdorff dimension is equal to $\log 2/\log 3$. 


Fuchsian and Schottky uniformization. 


Notice that, unlike Fuchsian uniformization, where the covering $H$ is the 
universal cover, in the case of Schottky uniformization $\Omega_\Gamma$ is very 
far from being simply connected; in fact it is the complement of a Cantor set. 

A relation between Schottky and Fuchsian uniformizations is given by passing 
to the covering that corresponds to the normal subgroup 
$N\langle a_1, \dots, a_g \rangle$ of $\pi_1(X)$ generated by half the 
generators $\{a_1, \dots, a_g\}$: 

$$\Gamma \simeq \pi_1(X)/N\langle a_1, \dots, a_g \rangle.$$ 


Surface with boundary: simultaneous uniformization 


In order to see the Schottky uniformization better, one can relate it to a 
simultaneous uniformization of the upper and lower half planes that yields two 
Riemann surfaces with boundary, joined along the boundary. 

A Schottky group that is specified by real parameters, so that it lies in 
$PSL(2,\mathbb{R})$, is called a Fuchsian Schottky group (cf. Fig. 8.8 
reproduced from the lectures [Mar04], Fig. 3). Viewed as a group of isometries 
of the hyperbolic plane $H$, or equivalently of the Poincaré disk, a Fuchsian 
Schottky group $G$ produces a quotient $G\backslash H$ which is topologically a 
Riemann surface with boundary. 


Fig. 8.8. Classical and Fuchsian Schottky groups 

A quasi-circle for $\Gamma$ is a Jordan curve $C$ in $\mathbb{P}^1(\mathbb{C})$ 
which is invariant under the action of $\Gamma$. In particular, such a curve 
contains the limit set $\Lambda_\Gamma$. The existence of a quasi-circle for a 
Riemann surface $X(\mathbb{C})$ of genus $g \ge 2$ is due to Bowen, cf. [Bo], 
[Mar04]. 

We have $\mathbb{P}^1(\mathbb{C}) \setminus C = \Omega_1 \cup \Omega_2$ and, for 
$\pi_\Gamma : \Omega_\Gamma \to X(\mathbb{C})$ the covering map, 

$$\hat{C} = \pi_\Gamma(C \cap \Omega_\Gamma) \subset X(\mathbb{C})$$ 

is a set of curves on $X(\mathbb{C})$ that disconnect the Riemann surface into 
the union of two surfaces with boundary, uniformized respectively by 
$\Omega_1$ and $\Omega_2$. 
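There exist conformal maps 

$$\alpha_i : \Omega_i \xrightarrow{\ \simeq\ } U_i, \qquad U_1 \cup U_2 = \mathbb{P}^1(\mathbb{C}) \setminus \mathbb{P}^1(\mathbb{R}) \tag{8.1.5}$$ 

with $U_i \simeq H$ = upper half planes in $\mathbb{P}^1(\mathbb{C})$, and with 

$$G_i = \{\alpha_i \gamma \alpha_i^{-1} \mid \gamma \in \tilde{\Gamma}\} \simeq \Gamma$$ 

Fuchsian Schottky groups $G_i \subset PSL(2,\mathbb{R})$. Here $\tilde{\Gamma}$ 
is the $\Gamma$-stabilizer of each of the two connected components in 
$\mathbb{P}^1(\mathbb{C}) \setminus C$. 
The compact Riemann surface is then obtained as 

$$X(\mathbb{C}) = X_1 \cup_{\partial X_1 = \hat{C} = \partial X_2} X_2,$$ 

with $X_i = U_i/G_i$ (Riemann surfaces with boundary $\hat{C}$; cf. Fig. 8.9 
reproduced here from the lectures [Mar04], Fig. 4). 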


Fig. 8.9. Fuchsian Schottky groups: Riemann surfaces with boundary 


In the case where $X(\mathbb{C})$ has a real structure $\iota : X \to X$, and 
the fixed point set $\mathrm{Fix}(\iota) = X(\mathbb{R})$ of the involution is 
nonempty, we have in fact $\hat{C} = X(\mathbb{R})$, and the quasi-circle is 
given by $\mathbb{P}^1(\mathbb{R})$. 


8.1.4 Hyperbolic handlebodies 


The action of a rank $g$ Schottky group $\Gamma \subset PSL(2,\mathbb{C})$ on 
$\mathbb{P}^1(\mathbb{C})$, by fractional linear transformations, extends to an 
action by isometries on $\mathbb{H}^3$. For a classical Schottky group, a 
fundamental domain in $\mathbb{H}^3$ is given by the region external to $2g$ 
half spheres over the circles $C_k \subset \mathbb{P}^1(\mathbb{C})$ (cf. 
Fig. 8.10 reproduced here from the lectures [Mar04], Fig. 5). 
The quotient 

$$\mathfrak{X}_\Gamma = \mathbb{H}^3/\Gamma \tag{8.1.6}$$ 

is topologically a handlebody of genus $g$ filling the Riemann surface 
$X(\mathbb{C})$ (cf. Fig. 8.11 reproduced here from the lectures [Mar04], 
Fig. 6). 

Metrically, $\mathfrak{X}_\Gamma$ is a real hyperbolic 3-manifold of infinite 
volume, having $X(\mathbb{C})$ as its conformal boundary at infinity 
$X(\mathbb{C}) = \partial \mathfrak{X}_\Gamma$. 


Fig. 8.10. Genus two: fundamental domain in $\mathbb{H}^3$ 


Fig. 8.11. Handlebody of genus two: fundamental domains in $\mathbb{H}^3$ 


We denote by $\overline{\mathfrak{X}}_\Gamma$ the compactification obtained by 
adding the conformal boundary at infinity, 

$$\overline{\mathfrak{X}}_\Gamma = (\mathbb{H}^3 \cup \Omega_\Gamma)/\Gamma. \tag{8.1.7}$$ 


In the genus zero case, we just have the sphere $\mathbb{P}^1(\mathbb{C})$ as 
the conformal boundary at infinity of $\mathbb{H}^3$, thought of as the unit 
ball in the Poincaré model. 
In the genus one case we have a solid torus $\mathbb{H}^3/q^{\mathbb{Z}}$, for 
$q \in \mathbb{C}^*$ acting as 

$$q(z, y) = (qz, |q|y)$$ 

in the upper half space model, with conformal boundary at infinity the Jacobi 
uniformized elliptic curve $\mathbb{C}^*/q^{\mathbb{Z}}$. 

In this case, the limit set consists of the points $\{0, \infty\}$, the domain 
of discontinuity is $\mathbb{C}^*$ and a fundamental domain is the annulus 
$\{|q| < |z| \le 1\}$ (exterior of two circles). 
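
The genus one action $(z, y) \mapsto (qz, |q|y)$ is the case $c = 0$ of the extension of fractional linear transformations to the upper half space. The following sketch checks this numerically, and also checks that the extension is compatible with matrix multiplication. It assumes the standard formula $(z, x) \mapsto \big(((az+b)\overline{(cz+d)} + a\bar{c}x^2)/D,\ x/D\big)$ with $D = |cz+d|^2 + |c|^2x^2$ and $ad - bc = 1$; the function names and sample values are ours.

```python
import cmath


def act(g, point):
    """Apply g = (a, b, c, d), with a d - b c = 1, to a point (z, x), x > 0."""
    a, b, c, d = (complex(v) for v in g)
    z, x = point
    denom = abs(c * z + d) ** 2 + abs(c) ** 2 * x ** 2
    z_new = ((a * z + b) * (c * z + d).conjugate()
             + a * c.conjugate() * x ** 2) / denom
    return z_new, x / denom


def compose(g, h):
    """Matrix product of g and h, both given as tuples (a, b, c, d)."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)


def genus_one_generator(q):
    """Diagonal matrix of determinant 1 inducing z -> q z on the boundary."""
    s = cmath.sqrt(q)
    return (s, 0, 0, 1 / s)


# Hypothetical sample data.
q = 0.3 + 0.4j
z_new, x_new = act(genus_one_generator(q), (1 + 2j, 0.7))
```

Here `z_new` agrees with $q \cdot (1 + 2i)$ and `x_new` with $|q| \cdot 0.7$, up to rounding.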

The relation of Schottky uniformization to the usual Euclidean uniformization 
of complex tori $X = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ is given by 
$q = \exp(2\pi i \tau)$. 
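
To make the relation concrete: the map $z \mapsto \exp(2\pi i z)$ sends the translation $z \mapsto z + 1$ to the identity and the translation $z \mapsto z + \tau$ to multiplication by $q$, so it induces a map from $\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ to $\mathbb{C}^*/q^{\mathbb{Z}}$. A minimal numerical check, with sample values of $\tau$ and $z$ chosen only for illustration:

```python
import cmath


def to_punctured_plane(z):
    """The map z -> exp(2 pi i z) from C to C^*."""
    return cmath.exp(2j * cmath.pi * z)


tau = 0.25 + 1.1j            # sample period with Im(tau) > 0 (hypothetical)
q = to_punctured_plane(tau)  # q = exp(2 pi i tau), hence |q| < 1
z = 0.3 - 0.2j               # arbitrary sample point

shift_by_one = to_punctured_plane(z + 1)    # same image as z
shift_by_tau = to_punctured_plane(z + tau)  # image of z multiplied by q
```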

In the case $g \ge 2$, the limit set $\Lambda_\Gamma$ is a Cantor set with an 
interesting dynamics of the action of $\Gamma$. It is the dynamics of the 
Schottky group on its limit set that generates an interesting noncommutative 
space. 


Geodesics in $\mathfrak{X}_\Gamma$ 


The hyperbolic handlebody $\mathfrak{X}_\Gamma$ has infinite volume, but it 
contains a region of finite volume, which is a deformation retract of 
$\mathfrak{X}_\Gamma$. This is called the “convex core” of 
$\mathfrak{X}_\Gamma$ and is obtained by taking the geodesic hull of the limit 
set $\Lambda_\Gamma$ in $\mathbb{H}^3$ and then the quotient by $\Gamma$. 

We identify different classes of infinite geodesics in $\mathfrak{X}_\Gamma$: 


Closed geodesics: since $\Gamma$ is purely loxodromic, for all 
$\gamma \in \Gamma$ there exist two fixed points 
$\{z^\pm(\gamma)\} \subset \mathbb{P}^1(\mathbb{C})$. The geodesics in 
$\mathbb{H}^3 \cup \mathbb{P}^1(\mathbb{C})$ with ends at two such points 
$\{z^\pm(\gamma)\}$, for some $\gamma \in \Gamma$, correspond to closed 
geodesics in the quotient $\mathfrak{X}_\Gamma$. 

Bounded geodesics: the images in $\mathfrak{X}_\Gamma$ of geodesics in 
$\mathbb{H}^3 \cup \mathbb{P}^1(\mathbb{C})$ having both ends on the limit set 
$\Lambda_\Gamma$ are geodesics which remain confined within the convex core of 
$\mathfrak{X}_\Gamma$. 

Unbounded geodesics: these are the geodesics in $\mathfrak{X}_\Gamma$ that 
eventually wander off the convex core towards the conformal boundary 
$X(\mathbb{C})$ at infinity. They correspond to geodesics in 
$\mathbb{H}^3 \cup \mathbb{P}^1(\mathbb{C})$ with at least one end at a point of 
$\Omega_\Gamma$. 


In the genus one case, there is a unique primitive closed geodesic, namely the 
image in the quotient of the geodesic in $\mathbb{H}^3$ connecting 0 and 
$\infty$. The bounded geodesics are those corresponding to geodesics in 
$\mathbb{H}^3$ originating at 0 or $\infty$. 

The most interesting case is that of genus $g \ge 2$, where the bounded 
geodesics form a complicated tangle inside $\mathfrak{X}_\Gamma$. Topologically, 
this is a generalized solenoid, namely it is locally the product of a line and a 
Cantor set. 


8.1.5 Arakelov geometry and hyperbolic geometry 


In this section we describe the result in [Man91] on the relation between 
the Arakelov Green function on a Riemann surface $X(\mathbb{C})$ with Schottky 
uniformization and geodesics in the 3-dimensional hyperbolic handlebody 
$\mathfrak{X}_\Gamma$. 


Arakelov Green function 


Given a divisor $A = \sum_x m_x (x)$ with support $|A|$ on a compact smooth 
Riemann surface $X(\mathbb{C})$, and a choice of a positive real-analytic 2-form 
$d\mu$ on $X(\mathbb{C})$, the Green function $g_{\mu,A} = g_A$ is a real 
analytic function on $X(\mathbb{C}) \setminus |A|$, uniquely determined by the 
following conditions: 


Laplace equation: $g_A$ satisfies 
$\partial\bar{\partial} g_A = \pi i(\deg(A)\, d\mu - \delta_A)$, with $\delta_A$ 
the $\delta$-current $\varphi \mapsto \sum_x m_x \varphi(x)$. 

Singularities: if $z$ is a local coordinate in a neighbourhood of $x$, then 
$g_A - m_x \log|z|$ is locally real analytic. 

Normalization: $g_A$ satisfies $\int_X g_A\, d\mu = 0$. 

If $B = \sum_y n_y (y)$ is another divisor, such that 
$|A| \cap |B| = \emptyset$, then the expression 
$g_\mu(A, B) := \sum_y n_y\, g_{\mu,A}(y)$ is symmetric and biadditive in 
$A, B$. Generally, such an expression $g_\mu$ depends on $\mu$, where the choice 
of $\mu$ is equivalent to the choice of a real analytic Riemannian metric on 
$X(\mathbb{C})$, compatible with the complex structure. 

However, in the special case of degree zero divisors, 
$\deg A = \deg B = 0$, the $g_\mu(A, B) = g(A, B)$ are conformal invariants. 

In the case of the Riemann sphere $\mathbb{P}^1(\mathbb{C})$, if $w_A$ is a 
meromorphic function with $\mathrm{Div}(w_A) = A$, we have 

$$g(A, B) = \log \prod_{y \in |B|} |w_A(y)|^{n_y} = \mathrm{Re} \int_{\gamma_B} \frac{dw_A}{w_A}, \tag{8.1.8}$$ 

where $\gamma_B$ is a 1-chain with boundary $B$. 

In the case of degree zero divisors $A, B$ on a Riemann surface of higher 
genus, the formula (8.1.8) can be generalized by replacing the logarithmic 
differential $dw_A/w_A$ with a differential of the third kind (a meromorphic 
differential with nonvanishing residues) $\omega_A$ with purely imaginary 
periods and residues $m_x$ at $x$. This gives 

$$g(A, B) = \mathrm{Re} \int_{\gamma_B} \omega_A. \tag{8.1.9}$$ 

Thus one can explicitly compute $g(A, B)$ from a basis of differentials of the 
third kind with purely imaginary periods. 


Cross ratio and geodesics 


The basic step in expressing the Arakelov Green function in terms of geodesics 
in the hyperbolic handlebody $\mathfrak{X}_\Gamma$ is a very simple classical 
fact of hyperbolic geometry, namely the fact that the cross ratio of four points 
on $\mathbb{P}^1(\mathbb{C})$ can be expressed in terms of geodesics in the 
interior $\mathbb{H}^3$: 

$$\log |\langle a, b, c, d \rangle| = -\mathrm{ordist}(a * \{c, d\},\ b * \{c, d\}). \tag{8.1.10}$$ 

Here, ordist denotes the oriented distance, and we use the notation 
$a * \{c, d\}$ to indicate the point on the geodesic $\{c, d\}$ in 
$\mathbb{H}^3$ with endpoints $c, d \in \mathbb{P}^1(\mathbb{C})$, obtained as 
the intersection of $\{c, d\}$ with the unique geodesic from $a$ that cuts 
$\{c, d\}$ at a right angle (cf. Fig. 8.12 reproduced here from the lectures 
[Mar04], Fig. 8). 


Differentials and Schottky uniformization 


The next important step is an explicit construction of a basis of differentials 
of the third kind with purely imaginary periods for a Riemann surface 
$X(\mathbb{C}) = \Gamma\backslash\Omega_\Gamma$ with a Schottky uniformization. 
This construction uses averages over the group $\Gamma$ of expressions involving 
the cross ratio 

$$\langle a, b, c, d \rangle := \frac{(a-b)(c-d)}{(a-d)(c-b)}. \tag{8.1.11}$$ 

[Fig. 8.12: the points $a * \{c, d\}$ and $b * \{c, d\}$ on the geodesic $\{c, d\}$] 
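
Averaging over $\Gamma$ relies on the behaviour of the cross ratio under fractional linear transformations. As a quick sanity check, the sketch below verifies numerically that a cross ratio is invariant when all four points are moved by the same transformation. The convention $\langle a,b,c,d\rangle = (a-b)(c-d)/((a-d)(c-b))$ is assumed here, and the helper names and sample data are ours; invariance holds for any cross ratio convention in which every point occurs once in the numerator and once in the denominator.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio in the convention (a-b)(c-d) / ((a-d)(c-b)) assumed here."""
    return (a - b) * (c - d) / ((a - d) * (c - b))


def moebius(g, z):
    """Apply the fractional linear transformation g = (p, q, r, s) to z."""
    p, q, r, s = g
    return (p * z + q) / (r * z + s)


# Hypothetical sample data: one transformation and four distinct points.
g = (2, 1, 1, 1)
points = (0.5j, 1 + 1j, -2 + 0j, 3 - 1j)

before = cross_ratio(*points)
after = cross_ratio(*(moebius(g, z) for z in points))
```

The two values `before` and `after` coincide up to rounding.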


Let us denote by $C(|\gamma)$ a set of representatives for 
$\Gamma/(\gamma^{\mathbb{Z}})$, and by $S(\gamma)$ the conjugacy class of 
$\gamma$ in $\Gamma$. 

Let $w_A$ be a meromorphic function on $\mathbb{P}^1(\mathbb{C})$ with divisor 
$A = (a) - (b)$, such that the support $|A|$ is contained in the complement of 
an open neighbourhood of $\Lambda_\Gamma$. 

For a fixed choice of a base point $z_0 \in \Omega_\Gamma$, the series 

$$\nu_{(a)-(b)} = \sum_{\gamma \in \Gamma} d\log \langle a, b, \gamma z, \gamma z_0 \rangle \tag{8.1.12}$$ 

gives the lift to $\Omega_\Gamma$ of a differential of the third kind on the 
Riemann surface $X(\mathbb{C})$, endowed with the choice of Schottky 
uniformization. These differentials have residues $\pm 1$ at the images of $a$ 
and $b$ in $X(\mathbb{C})$, and they have vanishing $a_k$-periods, where 
$\{a_k, b_k\}_{k=1,\dots,g}$ are the generators of the homology of 
$X(\mathbb{C})$. 

Similarly, we obtain lifts of differentials of the first kind on 
$X(\mathbb{C})$ by considering the series 

$$\omega_\gamma = \sum_{h \in C(|\gamma)} d\log \langle h z^+(\gamma), h z^-(\gamma), z, z_0 \rangle, \tag{8.1.13}$$ 

where we denote by $\{z^+(\gamma), z^-(\gamma)\} \subset \Lambda_\Gamma$ the 
pair of the attractive and repelling fixed points of $\gamma \in \Gamma$. 

The series (8.1.12) and (8.1.13) converge on compact sets 
$K \subset \Omega_\Gamma$ whenever $\dim_H \Lambda_\Gamma < 1$. Moreover, they 
do not depend on the choice of the base point $z_0 \in \Omega_\Gamma$. 

In particular, given a choice $\{\gamma_k\}_{k=1}^g$ of generators of the 
Schottky group $\Gamma$, we obtain by the series (8.1.13) a basis of 
holomorphic differentials that satisfy 

$$\int_{a_k} \omega_{\gamma_l} = 2\pi\sqrt{-1}\,\delta_{kl}. \tag{8.1.14}$$ 

One can then use a linear combination of the holomorphic differentials 
$\omega_{\gamma_l}$ to correct the meromorphic differentials $\nu_{(a)-(b)}$ in 
such a way that the resulting meromorphic differentials have purely imaginary 
$b_k$-periods. Let $X_l(a, b)$ be coefficients such that the differentials of 
the third kind 

$$\omega_{(a)-(b)} := \nu_{(a)-(b)} - \sum_l X_l(a, b)\,\omega_{\gamma_l} \tag{8.1.15}$$ 

have purely imaginary periods. The coefficients $X_l(a, b)$ satisfy the system 
of equations 

$$\sum_{l=1}^g X_l(a, b)\,\mathrm{Re} \int_{b_k} \omega_{\gamma_l} = \mathrm{Re} \int_{b_k} \nu_{(a)-(b)} = \sum_{h \in S(\gamma_k)} \log |\langle a, b, z^+(h), z^-(h) \rangle|. \tag{8.1.16}$$ 

Thus one obtains that the Arakelov Green function for $X(\mathbb{C})$ with 
Schottky uniformization can be computed as 

$$g((a) - (b), (c) - (d)) = \sum_{h \in \Gamma} \log |\langle a, b, hc, hd \rangle| - \sum_{l=1}^g X_l(a, b) \sum_{h \in S(\gamma_l)} \log |\langle z^+(h), z^-(h), c, d \rangle|. \tag{8.1.17}$$ 


Notice that this result seems to indicate that there is a choice of Schottky uni- 
formization involved as additional data for Arakelov geometry at arithmetic 
infinity. However, it was already noticed that the Schottky uniformization is 
determined by the real structure at least for real Archimedean primes. 


Green function and geodesics 


One can explicitly express the Green function in terms of geodesics using the 
formula (8.1.10) together with the obtained expression (8.1.17): 

$$g((a) - (b), (c) - (d)) = -\sum_{h \in \Gamma} \mathrm{ordist}(a * \{hc, hd\},\ b * \{hc, hd\}) + \sum_{l=1}^g X_l(a, b) \sum_{h \in S(\gamma_l)} \mathrm{ordist}(z^+(h) * \{c, d\},\ z^-(h) * \{c, d\}). \tag{8.1.18}$$ 

The coefficients $X_l(a, b)$ can also be expressed in terms of geodesics, using 
the equation (8.1.16). 


8.2 Cohomological Constructions, Archimedean 
Frobenius and Regularized Determinants 


8.2.1 Archimedean cohomology 


Given Deninger’s formulae (III.3.12) and (III.3.13) as above, it is natural to 
ask for a cohomological interpretation of the data $(H^m, \Phi)$ (see 
(III.3.14)). A general answer was found by C. Consani in [Cons98], for general 
arithmetic varieties (in any dimension), giving a cohomological interpretation 
of the pair $(H^m, \Phi)$ in Deninger’s calculation of the Archimedean 
L-factors as regularized determinants. 

Her construction was motivated by the analogy between geometry at arithmetic 
infinity and the classical geometry of a degeneration over a disk. She 
introduced a double complex of differential forms with an endomorphism $N$ 
representing the “logarithm of the monodromy” around the special fiber at 
arithmetic infinity, which is modelled on (a resolution of) the complex of 
nearby cycles in the geometric case. The definition of the complex of nearby 
cycles and of its resolution, on which the following construction is modelled, 
is rather technical. What is easier to visualize geometrically is the related 
complex of vanishing cycles of a geometric degeneration (see Fig. 8.13 
reproduced here from the lectures [Mar04], Fig. 15). 


Fig. 8.13. Vanishing cycles 


Let us describe here the construction of [Cons98] (see also [CM04b]). One 
constructs the cohomology theory underlying the data (III.3.14) in several 
steps. 

Let $X = X(\mathbb{C})$ be a complex Kähler manifold. 

Step 1: Consider first a doubly infinite graded complex 

$$C^\bullet = \Omega^\bullet(X) \otimes \mathbb{C}[U, U^{-1}] \otimes \mathbb{C}[\hbar, \hbar^{-1}], \tag{8.2.1}$$ 

where $\Omega^\bullet$ is the de Rham complex of differential forms on $X$, 
while $U$ and $\hbar$ are formal variables, with $U$ of degree two and $\hbar$ 
of degree zero. 
Let us consider on this complex the differentials 

$$d'_C := \hbar\, d, \qquad d''_C := \sqrt{-1}(\bar{\partial} - \partial), \tag{8.2.2}$$ 

with total differential $\delta_C = d'_C + d''_C$. 
We also have an inner product 

$$\langle \alpha \otimes U^r \otimes \hbar^k,\ \eta \otimes U^s \otimes \hbar^t \rangle = \langle \alpha, \eta \rangle\, \delta_{r,s}\, \delta_{k,t}, \tag{8.2.3}$$ 

where $\langle \alpha, \eta \rangle$ is the usual Hodge inner product of forms, 

$$\langle \alpha, \eta \rangle = \int_X \alpha \wedge * C(\bar{\eta}), \tag{8.2.4}$$ 

with $C(\eta) = (\sqrt{-1})^{p-q}$, for $\eta \in \Omega^{p,q}$. 
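Step 2: Let us use the Hodge filtration 

$$F^p\Omega^m(X) := \bigoplus_{p'+q=m,\ p' \ge p} \Omega^{p',q}(X) \tag{8.2.5}$$ 

to define linear subspaces of the complex (8.2.1) of the form 

$$\mathcal{C}^{m,2r} = \bigoplus_{\substack{p+q=m \\ k \ge \max\{0,\, 2r+m\}}} F^{m+r-k}\Omega^m(X) \otimes U^r \otimes \hbar^k \tag{8.2.6}$$ 

and the $\mathbb{Z}$-graded vector space 

$$\mathcal{C}^\bullet = \bigoplus_{\bullet = 2r+m} \mathcal{C}^{m,2r}. \tag{8.2.7}$$ 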


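Step 3: Let us pass to a real vector space by considering 

$$T^\bullet = (\mathcal{C}^\bullet)^{c = \mathrm{id}}, \tag{8.2.8}$$ 

where $c$ denotes complex conjugation. 
In terms of the intersection of the Hodge filtrations 

$$\gamma^\bullet = F^\bullet \cap \bar{F}^\bullet \tag{8.2.9}$$ 

we have 

$$T^\bullet = \bigoplus_{\bullet = 2r+m} T^{m,2r}, \tag{8.2.10}$$ 

where 

$$T^{m,2r} = \bigoplus_{\substack{p+q=m \\ k \ge \max\{0,\, 2r+m\}}} \gamma^{m+r-k}\Omega^m(X) \otimes U^r \otimes \hbar^k. \tag{8.2.11}$$ 

The $\mathbb{Z}$-graded complex vector space $\mathcal{C}^\bullet$ is a 
subcomplex of $C^\bullet$ with respect to the differential $d'_C$, and for 
$P^\perp$ the orthogonal projection onto $\mathcal{C}^\bullet$ in the inner 
product (8.2.3), one obtains a second differential $d'' = P^\perp d''_C$. 
Similarly, $d' = d'_C$ and $d'' = P^\perp d''_C$ define differentials on the 
$\mathbb{Z}$-graded real vector space $T^\bullet$ in terms of the corresponding 
cutoffs on the indices of the complex $C^\bullet$. For 

$$\Lambda_{p,q} = \{(r, k) \in \mathbb{Z}^2 \mid k \ge \kappa(p, q, r)\} \tag{8.2.12}$$ 

with 

$$\kappa(p, q, r) = \max\left\{0,\ 2r + m,\ \frac{|p - q| + 2r + m}{2}\right\} \tag{8.2.13}$$ 

(see Fig. 8.14 reproduced here from the lectures [Mar04], Fig. 16), we identify 
$T^\bullet$ as a real vector space with the span 

$$T^\bullet = \mathbb{R}\langle \alpha \otimes U^r \otimes \hbar^k \rangle, \tag{8.2.14}$$ 

where $(r, k) \in \Lambda_{p,q}$ for $\alpha = \xi + \bar{\xi}$ with 
$\xi \in \Omega^{p,q}$. 


Fig. 8.14. Cutoffs defining the complex at arithmetic infinity 
[Figure labels: $2r - 2k + |p - q| + m = 0$; $k = 0$] 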
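
The cutoff is easy to tabulate. The sketch below assumes the form $\kappa(p,q,r) = \max\{0,\ 2r+m,\ (|p-q|+2r+m)/2\}$ with $m = p+q$, and uses the observation that $|p-q| + p + q = 2\max(p,q)$, so the third entry equals $\max(p,q) + r$ and is always an integer. The function names and the truncation parameter `k_max` are ours, introduced only to list finitely many admissible pairs $(r,k)$.

```python
def kappa(p, q, r):
    """Cutoff max{0, 2r + m, (|p - q| + 2r + m)/2} with m = p + q (assumed form)."""
    m = p + q
    # |p - q| + m = 2 * max(p, q), so the division by 2 is exact.
    return max(0, 2 * r + m, (abs(p - q) + 2 * r + m) // 2)


def admissible_pairs(p, q, r_values, k_max):
    """List the pairs (r, k) with k >= kappa(p, q, r), truncated at k <= k_max."""
    return [(r, k)
            for r in r_values
            for k in range(kappa(p, q, r), k_max + 1)]


# Example: forms of type (1, 0), with r between -2 and 1 and k at most 3.
pairs = admissible_pairs(1, 0, range(-2, 2), 3)
```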


Operators 


The complex $(T^\bullet, \delta)$ has interesting structures given by the action 
of certain linear operators. 

We have the operators $N$ and $\Phi$ that correspond to the “logarithm of the 
monodromy” and the “logarithm of Frobenius”. These are of the form 

$$N = U\hbar, \qquad \Phi = -U \frac{\partial}{\partial U}, \tag{8.2.15}$$ 

and they satisfy $[N, d'] = [N, d''] = 0$ and $[\Phi, d'] = [\Phi, d''] = 0$, 
hence they induce operators in cohomology. 
Moreover, there is another important operator, which corresponds to the 
Lefschetz operator on forms, 

$$L : \eta \otimes U^r \otimes \hbar^k \mapsto \eta \wedge \omega \otimes U^{r-1} \otimes \hbar^k, \tag{8.2.16}$$ 

where $\omega$ is the Kähler form on the manifold $X$. This satisfies 
$[L, d'] = [L, d''] = 0$, and also descends to the cohomology. 

The pairs of operators $N$ and $\Phi$, or $L$ and $\Phi$, satisfy interesting 
commutation relations 

$$[\Phi, N] = -N, \qquad [\Phi, L] = L$$ 

that can be viewed as an action of the ring of differential operators 

$$\mathbb{C}[P, Q]/(PQ - QP = Q).$$ 
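
These commutation relations can be checked on formal monomials. In the sketch below a basis element $\eta \wedge \omega^w \otimes U^r \otimes \hbar^k$ is encoded by the triple `(w, r, k)`, with $N$ raising $r$ and $k$ by one, $\Phi$ multiplying by $-r$, and $L$ raising $w$ and lowering $r$ by one. This encoding is ours and deliberately ignores the cutoffs and projections that define the actual complex; it only tests the formal identities $[\Phi, N] = -N$ and $[\Phi, L] = L$.

```python
def apply_op(op, vec):
    """Apply a linear operator to a vector stored as {basis: coefficient}."""
    out = {}
    for basis, coeff in vec.items():
        for new_basis, factor in op(basis):
            out[new_basis] = out.get(new_basis, 0) + coeff * factor
    return {b: c for b, c in out.items() if c != 0}


def op_N(basis):
    w, r, k = basis
    return [((w, r + 1, k + 1), 1)]   # N = U * hbar


def op_Phi(basis):
    w, r, k = basis
    return [((w, r, k), -r)]          # Phi = -U d/dU


def op_L(basis):
    w, r, k = basis
    return [((w + 1, r - 1, k), 1)]   # wedge with omega, lower the power of U


def commutator(A, B, vec):
    """Return [A, B] vec = A(B vec) - B(A vec)."""
    diff = dict(apply_op(A, apply_op(B, vec)))
    for basis, coeff in apply_op(B, apply_op(A, vec)).items():
        diff[basis] = diff.get(basis, 0) - coeff
    return {b: c for b, c in diff.items() if c != 0}


def scale(vec, s):
    return {b: s * c for b, c in vec.items() if s * c != 0}


sample = {(0, 2, 1): 1}
phi_n = commutator(op_Phi, op_N, sample)   # expected to equal -N(sample)
phi_l = commutator(op_Phi, op_L, sample)   # expected to equal  L(sample)
```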


SL(2,R) representations 


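Another important ingredient of the structure of the complex 
$(T^\bullet, \delta)$ are two involutions 

$$S : \alpha \otimes U^r \otimes \hbar^{2r+m+l} \mapsto \alpha \otimes U^{-(r+m)} \otimes \hbar^{l}, \tag{8.2.17}$$ 
$$\tilde{S} : \alpha \otimes U^r \otimes \hbar^{k} \mapsto C(*\alpha) \otimes U^{r-(n-m)} \otimes \hbar^{k}. \tag{8.2.18}$$ 

These maps, together with the nilpotent operators $N$ and $L$, define two 
representations, $\sigma^L$ and $\sigma^R$, of the group $SL(2,\mathbb{R})$, 
given explicitly on the following generators 

$$\nu(s) = \begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix},\ s \in \mathbb{R}^*, \qquad u(t) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix},\ t \in \mathbb{R}, \qquad w = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{8.2.19}$$ 

namely 

$$\sigma^L(u(t)) = \exp(tL), \qquad \sigma^R(u(t)) = \exp(tN). \tag{8.2.20}$$ 

Of these representations, $\sigma^L$ extends to an action by bounded operators 
on the Hilbert completion of $T^\bullet$ in the inner product (8.2.4), while the 
action of the subgroup $\nu(s)$, $s \in \mathbb{R}^*$, of $SL(2,\mathbb{R})$ via 
the representation $\sigma^R$ on the Hilbert space is by unbounded densely 
defined operators. 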


8.2.2 Local factor and Archimedean cohomology 


C. Consani showed in [Cons98] that the data $(H^m, \Phi)$ of (III.3.14) can be 
identified with 

$$(H^\bullet(T^\bullet, \delta)^{N=0}, \Phi),$$ 

where $H^\bullet(T^\bullet, \delta)$ is the hypercohomology (the cohomology with 
respect to the total differential) of the complex $T^\bullet$, and 
$H^\bullet(T^\bullet, \delta)^{N=0}$ is the kernel of the map induced by $N$ on 
cohomology. The operator $\Phi$ is the one induced on cohomology by that of 
(8.2.15). She called $H^\bullet \simeq H^\bullet(T^\bullet, \delta)^{N=0}$ the 
Archimedean cohomology. It was also shown in [Cons98] that these cohomology 
groups can be obtained as a piece of the cohomology of the cone of the 
monodromy $N$. This is the complex 

$$\mathrm{Cone}(N)^\bullet = T^\bullet \oplus T^\bullet[+1] \tag{8.2.21}$$ 

with differential 

$$D = \begin{pmatrix} \delta & -N \\ 0 & \delta \end{pmatrix}.$$ 

The complex (8.2.21) inherits a positive definite inner product from 
$T^\bullet$, which descends to cohomology. The representation $\sigma^L$ of 
$SL(2,\mathbb{R})$ on $T^\bullet$ induces a representation on 
$\mathrm{Cone}(N)^\bullet$. The corresponding infinitesimal representation 
$d\sigma^L : \mathfrak{g} \to \mathrm{End}(T^\bullet)$ of the Lie algebra 
$\mathfrak{g} = \mathfrak{sl}(2,\mathbb{R})$ extends to a representation of the 
universal enveloping algebra $U(\mathfrak{g})$ on $T^\bullet$ and to a 
representation on $\mathrm{Cone}(N)^\bullet$. This gives a representation in 
the algebra of bounded operators on the Hilbert completion of 
$\mathrm{Cone}(N)^\bullet$ under the inner product. 


Theorem 8.2. The triple

$$(A, H, D) = (U(g), H^\cdot(\mathrm{Cone}(N)^\cdot), \Phi)$$

has the properties that $D = D^*$ and that $(1 + D^2)^{-1/2}$ is a compact operator. The commutators $[D, a]$ are bounded operators for all $a \in U(g)$, and the triple is $1^+$-summable.


Thus, (A,H,D) has most of the properties of a spectral triple confirming 
the fact that the logarithm of Frobenius ® should be thought of as a Dirac 
operator. However, we are not dealing here with an involutive subalgebra of 
a C*-algebra. 

In any case, the structure is sufficient to consider zeta functions for this 
“spectral triple”. In particular, we can recover the alternating products of the 
local L-factors at infinity from a zeta function of the spectral triple. 
Theorem 8.3. Consider the zeta-function

$$\zeta_{a,\Phi}(z) = \mathrm{Tr}(a|\Phi|^{z})$$

with $a = \sigma^L(w)$. This gives

$$\det{}_{\infty,\sigma^L(w),\Phi}(s) = \prod_{m=0}^{2n} L(H^m(X), s)^{(-1)^m}.$$


8.2.3 Cohomological constructions 


In the rest of the section we give, following [CM], some explanations of a cohomological theory for the Archimedean fiber of an Arakelov surface. However, there is no need to restrict to the case dim X = 1, because the general theory, valid for any arithmetic variety, was developed in [Cons98] (see also an interesting discussion of the Archimedean cohomology in [Mar04], Chapter 3). This construction provides an alternative definition and a refinement for the Archimedean cohomology $H^\cdot_{ar}$ introduced by Deninger in [Den91].

The cohomology spaces $H^\cdot(\tilde{X}^*)$ are infinite dimensional real vector spaces endowed with a monodromy operator $N$ and an endomorphism $\Phi$. The cohomology group $H^\cdot(\tilde{X}^*)$ is in fact the same as $H^\cdot(T^\cdot, \delta)|_{U=(2\pi i)^{-1}}$ in the previous sections. The groups $H^\cdot_{ar}$ can be identified with the subspace of the $N$-invariants (i.e. $\mathrm{Ker}(N)$) over which (the restriction of) $\Phi$ acts in the following way. The monodromy operator determines an integer, even graduation on

$$H^\cdot(\tilde{X}^*) = \bigoplus_{p \in \mathbb{Z}} gr^w_{2p} H^\cdot(\tilde{X}^*),$$

where each graded piece is still infinite dimensional. We will refer to it as the weight graduation. This graduation induces a corresponding one on the subspace

$$H^\cdot(\tilde{X}^*)^{N=0} := \bigoplus_{\cdot \ge 2p} gr^w_{2p} H^\cdot(\tilde{X}^*).$$

The summands $gr^w_{2p} H^\cdot(\tilde{X}^*)$ are finite dimensional real vector spaces on which $\Phi$ acts as a multiplication by the weight $p$.

When $X_{/\kappa}$ is a non-singular, projective curve defined over $\kappa = \mathbb{C}$ or $\mathbb{R}$, the description of $gr^w_{2p} H^\cdot(\tilde{X}^*)$ ($\cdot \ge 2p$) is particularly easy. For $\kappa = \mathbb{C}$, this space coincides with the de Rham cohomology $H^\cdot_{DR}(X_{/\mathbb{C}}, \mathbb{R})$ of the Riemann surface $X_{/\mathbb{C}}$.

A motivation for the definition of these complexes comes from the classical theory of mixed Hodge structures for an algebraic degeneration over a disk (and its arithmetical counterpart, the theory of Frobenius weights). The notation $H^\cdot(\tilde{X}^*)$, $H^\cdot(X^*)$, $H^\cdot(Y)$ followed in this section is purely formal. Namely, $\tilde{X}^*$, $X^*$ and $Y$ are only symbols, although this choice is motivated by the analogy with Steenbrink’s construction in [Stee76], in which $\tilde{X}^*$, $X^*$ and $Y$ describe resp. the geometric generic fiber and the complement of the special fiber $Y$ in the model. The space $H^\cdot(\tilde{X}^*)$ is the hypercohomology group of a double complex $K^{\cdot,\cdot}$ of real, differential twisted forms on which one defines an additional structure of polarized Lefschetz module.

The whole theory is inspired by the expectation that the fibers at infinity of an arithmetic variety should be thought to be semi-stable and more specifically to be “maximally degenerate or totally split”, cf. a discussion in §III.2.3. It would be natural to think that the construction of the complex $K^{\cdot,\cdot}$ on the Riemann surface $X_{/\kappa}$, whose structure and behavior gives the arithmetical information related to the “mysterious” fibers at infinity of an arithmetic surface, fits in with Arakelov’s intuition that Hermitian geometry on $X_{/\kappa}$ is enough to recover the intersection geometry on the fibers at infinity.


8.2.4 Zeta function of the special fiber and Reidemeister torsion 


In this paragraph we explain, following §3.5 of [CM] that in the case dim X = 
1, the expression of Theorem 8.3 can be interpreted as a Reidemeister torsion, 
and it is related to a zeta function for the fiber at arithmetic infinity. 

We begin by giving the definition of a zeta function of the special fiber 
of a semistable fibration, which motivates the analogous notion at arithmetic 
infinity. 

Let $X$ be a regular, proper and flat scheme over $\mathrm{Spec}(A)$, for $A$ a discrete valuation ring with quotient field $K$ and finite residue field $k$. Assume that $X$ has geometrically reduced, connected and one-dimensional fibers. Let us denote by $\eta$ and $v$ resp. the generic and the closed point of $\mathrm{Spec}(A)$ and by $\bar\eta$ and $\bar v$ the corresponding geometric points. Assume that the special fiber $X_v$ of $X$ is a connected, effective Cartier divisor with reduced normal crossings defined over $k = k(v)$. This degeneration is sometimes referred to as a semistable fibration over $\mathrm{Spec}(A)$.

Let $N_v$ denote the cardinality of $k$. Then, define the zeta-function of the special fiber $X_v$ as follows ($u$ is an indeterminate)

$$Z_{X_v}(u) = \frac{P_1(u)}{P_0(u)P_2(u)}, \qquad P_i(u) = \det(1 - f^* u \mid H^i(X_{\bar\eta}, \mathbb{Q}_\ell)^{I_{\bar v}}), \tag{8.2.22}$$

where $f^*$ is the geometric Frobenius, i.e. the map induced by the Frobenius morphism $f : X_{\bar v} \to X_{\bar v}$ on the cohomological inertia-invariants at $\bar v$.

The polynomials $P_i(u)$ are closely related to the characteristic polynomials of the Frobenius $F_i(u) = \det(u \cdot 1 - f^* \mid H^i(X_{\bar\eta}, \mathbb{Q}_\ell)^{I_{\bar v}})$ through the formula

$$P_i(u) = u^{b_i} F_i(u^{-1}), \qquad b_i = \mathrm{degree}(F_i). \tag{8.2.23}$$
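The relation (8.2.23) between $P_i(u) = \det(1 - f^*u)$ and the characteristic polynomial $F_i(u) = \det(u \cdot 1 - f^*)$ can be checked by hand on a toy example. The following minimal sketch assumes a $2 \times 2$ integer matrix standing in for the Frobenius action (so $b_i = 2$); the function names are illustrative only and nothing here is specific to étale cohomology.

```python
from fractions import Fraction


def char_poly(f, u):
    """F(u) = det(u*1 - f) for a 2x2 matrix f, evaluated at u."""
    (a, b), (c, d) = f
    return (u - a) * (u - d) - b * c


def reversed_poly(f, u):
    """P(u) = det(1 - f*u) for a 2x2 matrix f, evaluated at u."""
    (a, b), (c, d) = f
    return (1 - a * u) * (1 - d * u) - (b * u) * (c * u)


def relation_holds(f, u, degree=2):
    """Check P(u) == u^degree * F(1/u) at a nonzero rational point u."""
    u = Fraction(u)
    return reversed_poly(f, u) == u**degree * char_poly(f, 1 / u)
```

For example, `relation_holds(((2, 1), (1, 1)), Fraction(3, 7))` evaluates both sides exactly and compares them.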


The zeta function $Z_{X_v}(u)$ generalizes on a semistable fiber the description of the Hasse-Weil zeta function of a smooth, projective curve over a finite field.
Based on this construction we make the following definition for the fiber at an Archimedean prime of an arithmetic surface:

$$Z_\Phi(u) := \frac{P_1(u)}{P_0(u)P_2(u)}, \tag{8.2.24}$$

where we set

$$P_q(u) := \det{}_\infty\left(\frac{1}{2\pi} - u\,\frac{\Phi_q}{2\pi}\right), \tag{8.2.25}$$

with $\Phi_q = \Phi|_{H^q(\tilde{X}^*)^{N=0}}$.


In order to see how this is related to the result of Theorem 8.3, we recall briefly a simple observation of Milnor (cf. §3 [Mil68]). Suppose given a finite complex $L$ and an infinite cyclic covering $\tilde{L}$, with $H_*(\tilde{L}, \kappa)$ finitely generated over the coefficient field $\kappa$. Let $h : \pi_1 L \to \kappa(s)$ be the composition of the homomorphism $\pi_1 L \to \Pi$ associated to the cover with the inclusion $\Pi \subset \mathrm{Units}(\kappa(s))$. The Reidemeister torsion for this covering is given (up to multiplication by a unit of $\kappa\Pi$) by the alternating product of the characteristic polynomials $F_q(s)$ of the $\kappa$-linear map

$$s_* : H_q(\tilde{L}, \kappa) \to H_q(\tilde{L}, \kappa),$$

$$\tau(s) \simeq F_0(s)F_1(s)^{-1}F_2(s) \cdots F_n(s)^{\pm 1}. \tag{8.2.26}$$

Moreover, for a map $T : L \to L$, let $\zeta_T(u)$ be the Weil zeta

$$\zeta_T(u) = P_0(u)^{-1}P_1(u)P_2(u)^{-1} \cdots P_n(u)^{\pm 1},$$

where the polynomials $P_q(u)$ of the map $T_*$ are related to the characteristic polynomials $F_q(s)$ by (8.2.23) and $b_q$ are the $q$-th Betti numbers of the complex $L$. By analogy with (8.2.26), Milnor writes the Reidemeister torsion $\tau_T(s)$ (up to multiplication by a unit) as

$$\tau_T(s) := F_0(s)F_1(s)^{-1}F_2(s) \cdots F_n(s)^{\pm 1},$$

where $F_q(s)$ are the characteristic polynomials of the map $T_*$. Then the relation between zeta function and Reidemeister torsion is given by:

$$\zeta_T(s^{-1})\tau_T(s) = s^{\chi(L)}, \tag{8.2.27}$$

where $\chi(L)$ is the Euler characteristic of $L$.
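The relation (8.2.27) follows in two lines from (8.2.23). The derivation below is our own worked check, not quoted from [Mil68]; it only uses $P_q(u) = u^{b_q}F_q(u^{-1})$, the alternating products defining $\zeta_T$ and $\tau_T$, and $\chi(L) = \sum_q (-1)^q b_q$.

```latex
\begin{align*}
P_q(s^{-1}) &= s^{-b_q} F_q(s), \\
\zeta_T(s^{-1}) &= \prod_q P_q(s^{-1})^{(-1)^{q+1}}
  = \prod_q s^{(-1)^q b_q}\, F_q(s)^{(-1)^{q+1}}
  = s^{\chi(L)}\, \tau_T(s)^{-1}, \\
\zeta_T(s^{-1})\,\tau_T(s) &= s^{\chi(L)}.
\end{align*}
```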


Similarly, we can derive the relation between the zeta function of the fiber at infinity defined as in (8.2.24) and the alternating product of Gamma factors. Namely, we write

$$\frac{L_\mathbb{C}(H^1(X_{/\mathbb{C}},\mathbb{C}), s)}{L_\mathbb{C}(H^0(X_{/\mathbb{C}},\mathbb{C}), s) \cdot L_\mathbb{C}(H^2(X_{/\mathbb{C}},\mathbb{C}), s)} = \frac{F_0(s) \cdot F_2(s)}{F_1(s)}, \tag{8.2.28}$$

where we set

$$F_q(s) := \det{}_\infty\left(\frac{s}{2\pi} - \frac{\Phi_q}{2\pi}\right) \tag{8.2.29}$$

with $\Phi_q = \Phi|_{H^q(\tilde{X}^*)^{N=0}}$. For this reason we may regard (8.2.28) as the Reidemeister torsion of the fiber at arithmetic infinity:

$$\tau_\Phi(s) := \frac{F_0(s) \cdot F_2(s)}{F_1(s)}. \tag{8.2.30}$$


The relation between zeta function and Reidemeister torsion is then given 
as follows. 


Proposition 8.4. The zeta function $Z_\Phi$ of (8.2.24) and the Reidemeister torsion $\tau_\Phi$ of (8.2.30) are related by

$$Z_\Phi(s^{-1})\tau_\Phi(s) = s^{g-2} e^{\chi s \log s},$$

where $g$ is the genus of the Riemann surface $X_{/\mathbb{C}}$ and $\chi = 2 - 2g$ its Euler characteristic.


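Indeed, the result follows by a simple direct calculation of the regularized determinants. Namely, we compute (in the case $q = 0, 1$)

$$P_q(u) = \det{}_\infty\left(\frac{1}{2\pi} - u\,\frac{\Phi_q}{2\pi}\right) = \exp\left(-b_q \frac{d}{ds}\Big((2\pi)^s \sum_{n \ge 0} (1 + un)^{-s}\Big)\Big|_{s=0}\right)$$

$$= \exp\left(b_q\left(-\log\Gamma\left(\frac{1}{u}\right) + \frac{\log 2\pi}{u} + \frac{\log u}{2} - \frac{\log u}{u}\right)\right)$$

$$= (2\pi)^{b_q/u}\, \Gamma(1/u)^{-b_q}\, u^{b_q/2 - b_q/u},$$

where $b_q$ are the Betti numbers of $X_{/\mathbb{C}}$. The case $q = 2$ is analogous, but for the presence of the $+1$ eigenvalue in the spectrum of $\Phi_2$, hence we obtain

$$P_2(u) = \exp\left(-b_2 \frac{d}{ds}\Big((2\pi)^s u^{-s} \zeta\Big(s, \frac{1}{u} - 1\Big)\Big)\Big|_{s=0}\right) = \Gamma_\mathbb{C}\left(\frac{1}{u} - 1\right)^{-1} u^{\frac{3}{2} - \frac{1}{u}}.$$

Thus, we obtain

$$Z_\Phi(s^{-1}) = \frac{L_\mathbb{C}(H^0(X_{/\mathbb{C}},\mathbb{C}), s) \cdot L_\mathbb{C}(H^2(X_{/\mathbb{C}},\mathbb{C}), s)}{L_\mathbb{C}(H^1(X_{/\mathbb{C}},\mathbb{C}), s)} \cdot s^{g-2} e^{\chi s \log s}.$$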


8.3 Spectral Triples, Dynamics and Zeta Functions 


We have seen that in the Arakelov theory a completion of an arithmetic surface 
is achieved by enlarging the group of divisors by formal linear combinations 
of the “closed fibers at infinity”. For an arithmetic surface these fibers were 
described in [Man91] as follows: the dual graph of any such closed fiber has 
the form of an infinite tangle of bounded geodesics in a hyperbolic handlebody 
endowed with a Schottky uniformization. 

This Section is based on Sections 4, 5 and 6 of [CM], and on [CM04a]. We describe an alternative construction of cohomological spectral data $(A, H, \Phi)$, which is related to the description in [Man91] of the dual graph of the fiber at infinity. We use a geometric model for the dual graph as the mapping torus of a dynamical system $T$ on a Cantor set. We consider a noncommutative space which describes the action of the Schottky group on its limit set and parameterizes the “components of the closed fiber at infinity”. This can be identified with a Cuntz–Krieger algebra $O_A$. We describe a spectral triple for this noncommutative space, via a representation on the cochains of a “dynamical cohomology”, defined in terms of the tangle of bounded geodesics in the handlebody. In the same way as for the Archimedean cohomology of the previous section (§8.2), the Dirac operator agrees with the grading operator $\Phi$, that represents the “logarithm of a Frobenius–type operator” on the Archimedean cohomology. In fact, the Archimedean cohomology embeds in the dynamical cohomology, compatibly with the action of a real Frobenius $\bar{F}_\infty$, so that the local factor can again be recovered from these data. The duality isomorphism on the cohomology of the cone of $N$ corresponds to the pairing of dynamical homology and cohomology. This suggests the existence of a duality between the monodromy $N$ and the dynamical map $1 - T$.

In noncommutative geometry, the notion of a spectral triple provides the 
correct generalization of the classical structure of a Riemannian manifold. The 
two notions agree on a commutative space. In the usual context of Riemannian 
geometry, the definition of the infinitesimal element ds on a smooth spin man- 
ifold can be expressed in terms of the inverse of the classical Dirac operator 
D. This is the key remark that motivates the theory of spectral triples. In particular, the geodesic distance between two points on the manifold is defined in terms of $D^{-1}$ (cf. [Co94] §VI). The spectral triple that describes a classical
Riemannian spin manifold is (A, H,D), where A is the algebra of complex 
valued smooth functions on the manifold, H is the Hilbert space of square in- 
tegrable spinor sections, and D is the classical Dirac operator (a square root 
of the Laplacian). These data determine completely and uniquely the Rie- 
mannian geometry on the manifold. It turns out that, when expressed in this 
form, the notion of spectral triple extends to more general non-commutative 
spaces, where the data (A, H, D) consist of a C*-algebra A (or more generally 
of a smooth subalgebra of a C*-algebra) with a representation as bounded 
operators on a Hilbert space H, and an operator D on H that verifies the 
main properties of a Dirac operator. The notion of smoothness is determined 
by D: the smooth elements of A are defined by the intersection of domains of powers of the derivation given by commutator with |D|. The basic geometric structure encoded by the theory of spectral triples is Riemannian geometry, but in more refined cases, such as Kähler geometry, the additional structure can be easily encoded as additional symmetries.


In the constructions of this chapter, the Dirac operator D is obtained 
from the grading operator associated to a filtration on the cochains of the 
complex that computes the dynamical cohomology. The induced operator on 
the subspace identified with the Archimedean cohomology agrees with the 
“logarithm of Frobenius” of [Cons98] and [Den91]. 

This structure further enriches the geometric interpretation of the Archime- 
dean cohomology, giving it the meaning of spinors on a noncommutative man- 
ifold, with the logarithm of Frobenius introduced in [Den91] in the role of the 
Dirac operator. 

An advantage of this construction is that a completely analogous formula- 
tion exists in the case of Mumford curves. This provides a unified description 
of the Archimedean and totally split degenerate fibers of an arithmetic surface. 


Let $X$ be an arithmetic surface defined over $\mathrm{Spec}(\mathbb{Z})$ (or over $\mathrm{Spec}(O_K)$, for a number field $K$), having the smooth algebraic curve $X_{/\mathbb{Q}}$ as its generic fiber. Let $p$ be a finite prime where $X$ has totally split degenerate reduction. Then, the completion $\hat{X}_p$ at $p$ of the generic fiber of $X$ is a split-degenerate stable curve over $\mathbb{Q}_p$ (also called a Mumford curve) uniformized by the action of a $p$-adic Schottky group $\Gamma$. The dual graph of the reduction of $\hat{X}_p$ coincides with a finite graph obtained as the quotient of a tree $\Delta_\Gamma$ by the action of $\Gamma$.

The curve $\hat{X}_p$ is holomorphically isomorphic to a quotient of a subset of the ends of the Bruhat-Tits tree $\Delta$ of $\mathbb{Q}_p$ by the action of $\Gamma$. Thus, in this setting, the Bruhat-Tits tree at $p$ replaces the hyperbolic space $\mathbb{H}^3$ “at infinity”, and the analog of the tangle of bounded geodesics in $X_\Gamma$ is played by doubly infinite walks in $\Delta_\Gamma/\Gamma$.

In analogy with the Archimedean construction, the dynamical system $(W(\Delta_\Gamma/\Gamma), T)$ is described in Section 8.3.8, where $T$ is an invertible shift map on the set $W(\Delta_\Gamma/\Gamma)$ of doubly-infinite walks on the graph $\Delta_\Gamma/\Gamma$. The first cohomology group $H^1(W(\Delta_\Gamma/\Gamma)_T, \mathbb{Z})$ of the mapping torus $W(\Delta_\Gamma/\Gamma)_T$ of $T$ inherits a natural filtration using which a dynamical cohomology group was introduced. One then has a Cuntz-Krieger graph algebra $C^*(\Delta_\Gamma/\Gamma)$ and we can construct a spectral triple as in the case at infinity. The Dirac operator is related to the grading operator $\Phi$ that computes the local factor as a regularized determinant, as in [Den01], [Den94]. In [CM03], a possible way was suggested of extending such construction to places that are not of split degenerate reduction, inspired by the “foam space” construction of [Man72b] and [CM04a]. Notice however that, unlike the local factor at infinity, the factor at the non-Archimedean places involves the full spectrum of $D$ and not just its positive or negative part. It is believed that this difference should correspond to the presence of an underlying geometric space based on loop geometry, which manifests itself as loops at the non-Archimedean places and as “half loops” (holomorphic disks) at arithmetic infinity.

There is another important difference between the Archimedean and non-Archimedean cases. At the Archimedean prime the local factor is described in terms of zeta functions for a Dirac operator $D$ (cf. [CM], [Den91]). On the other hand, at the non-Archimedean places, in order to get the correct normalization as in [Den94], we need to introduce a rotation of the Dirac operator by the imaginary unit, $D \mapsto iD$. This rotation corresponds to the so called Wick rotation that moves poles on the real line to poles on the imaginary line (zeroes for the local factor) and appears to be a manifestation of a rotation from Minkowskian to Euclidean signature $it \mapsto t$, as already remarked in ([Man95], p.135): “imaginary time motion” may be held responsible for the fact that zeroes of $\Gamma(s)^{-1}$ are purely real whereas the zeroes of all non-Archimedean Euler factors are purely imaginary. It is expected, therefore, that a more refined construction would involve a version of spectral triples for Minkowskian signature*).


8.3.1 A dynamical theory at infinity 


Let us describe some dynamical theory tools used in constructions of Deninger-style cohomology spaces at arithmetical infinity. We explain that these spaces can be used in order to describe Gamma-factors of an arithmetic surface as certain zeta-regularized determinants and a “reduction modulo $\infty$” by means of Noncommutative geometry (the theory of spectral triples) in Section 8.4.


Since the uniformizing group $\Gamma$ is a free group, there is a simple way of obtaining a coding of the bounded geodesics in the handlebody $X_\Gamma$ (defined by (8.1.6)). The set of such geodesics can be identified with $\Lambda_\Gamma \times_\Gamma \Lambda_\Gamma$, by specifying the endpoints in $\mathbb{H}^3 \cup \mathbb{P}^1(\mathbb{C})$ modulo the action of $\Gamma$.

Given a choice of generators $\{\gamma_i\}_{i=1}^{g}$ for $\Gamma$, there is a bijection between the elements of $\Gamma$ and the set of all admissible walks in the Cayley graph of $\Gamma$, namely reduced words in the $\{\gamma_i\}_{i=1}^{2g}$, where we use the notation $\gamma_{i+g} := \gamma_i^{-1}$, for $i = 1, \ldots, g$.

In the following we consider the sets $S^+$ and $S$ of resp. right-infinite, doubly infinite admissible sequences in the $\{\gamma_i\}_{i=1}^{2g}$:

$$S^+ = \{a_0 a_1 \ldots a_\ell \ldots \mid a_i \in \{\gamma_i\}_{i=1}^{2g},\ a_{i+1} \ne a_i^{-1},\ \forall i \in \mathbb{N}\}, \tag{8.3.1}$$

$$S = \{\ldots a_{-m} \ldots a_{-1} a_0 a_1 \ldots a_\ell \ldots \mid a_i \in \{\gamma_i\}_{i=1}^{2g},\ a_{i+1} \ne a_i^{-1},\ \forall i \in \mathbb{Z}\}. \tag{8.3.2}$$


*) According to B. Mazur [Maz2000], “Archimedean and non-Archimedean phenomena remain, at bottom, puzzlingly different. Perhaps the next century will see a more profound understanding of the relation between these...”
The admissibility condition simply means that we only allow “reduced” words in the generators, without cancellations.
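To make the admissibility condition concrete, here is a small sketch that enumerates finite reduced words. It assumes letters are encoded by indices $1, \ldots, 2g$ with index $i + g$ standing for the inverse of generator $i$ (matching the notation $\gamma_{i+g} = \gamma_i^{-1}$); the function names are ours. A word is admissible when no letter is immediately followed by its inverse, so there are $2g(2g-1)^{n-1}$ admissible words with $n \ge 1$ letters.

```python
from itertools import product


def inverse(i, g):
    """Index of the inverse letter, with letters indexed 1..2g."""
    return i + g if i <= g else i - g


def is_admissible(word, g):
    """True if no letter is immediately followed by its inverse."""
    return all(b != inverse(a, g) for a, b in zip(word, word[1:]))


def admissible_words(length, g):
    """All reduced words with the given number of letters (brute force)."""
    letters = range(1, 2 * g + 1)
    return [w for w in product(letters, repeat=length) if is_admissible(w, g)]
```

The brute-force enumeration is exponential in the length, so it is only meant for small checks such as comparing `len(admissible_words(n, g))` with the closed formula.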

On the space $S$ we consider the topology generated by the sets $W^s(x, \ell) = \{y \in S \mid x_k = y_k,\ k \ge \ell\}$ and $W^u(x, \ell) = \{y \in S \mid x_k = y_k,\ k \le \ell\}$ for $x \in S$ and $\ell \in \mathbb{Z}$.

There is a two-sided shift operator $T$ acting on $S$ as the map

$$T(\ldots a_{-m} \ldots a_{-1}\, a_0\, a_1 \ldots a_\ell \ldots) = \ldots a_{-m+1} \ldots a_0\, a_1\, a_2 \ldots a_{\ell+1} \ldots \tag{8.3.3}$$

Then we can pass from the discrete dynamical system $(S, T)$ to its suspension flow and obtain the mapping torus.

The following topological space is defined in terms of the Smale space 
(S,T) and will be considered as a geometric realization of the “dual graph” 
associated to the fiber at arithmetic infinity of the arithmetic surface X. 


Definition 8.5. The mapping torus (suspension flow) of the dynamical system $(S, T)$ is defined as

$$S_T := S \times [0,1]/(x, 0) \sim (Tx, 1). \tag{8.3.4}$$

Topologically, this space is a solenoid, that is, a bundle over $S^1$ with fiber a Cantor set.


8.3.2 Homotopy quotient


The space $S_T$ is a very natural space associated to the noncommutative space

$$\Lambda_\Gamma \times_\Gamma \Lambda_\Gamma \simeq S/\mathbb{Z}, \tag{8.3.5}$$

where $\Lambda_\Gamma$ is the limit set of the action of $\Gamma$, and $\mathbb{Z}$ is acting in (8.3.5) via the invertible shift $T$ of (8.3.3). This space is given by the C*-algebra

$$A = C(S) \rtimes_T \mathbb{Z} \tag{8.3.6}$$

describing the action of the shift $T$ on the totally disconnected space $S$. One obtains the homotopy quotient (cf. [BaCo], [Co83]),

$$S_T = S \times_\mathbb{Z} \mathbb{R}. \tag{8.3.7}$$

This is a commutative space that provides, up to homotopy, a geometric model for the noncommutative space (8.3.5), which can be identified as in §8.4.1 with the quotient space of a foliation (8.4.3) whose generic leaf is contractible (a copy of $\mathbb{R}$).

In order to study such spaces, K-theory tools are used.

The C*-algebra $C(S)$ is a commutative AF-algebra (approximately finite dimensional), obtained as the direct limit of the finite dimensional commutative C*-algebras generated by characteristic functions of a covering of $S$.
There is the Pimsner–Voiculescu exact sequence (cf. [PV80]) which has the form

$$\begin{array}{ccccc}
K_1(A) & \longrightarrow & K_0(C(S)) & \xrightarrow{\ \delta = 1 - T\ } & K_0(C(S)) \\
\uparrow & & & & \downarrow \\
K_1(C(S)) & \longleftarrow & K_1(C(S)) & \longleftarrow & K_0(A)
\end{array} \tag{8.3.8}$$

where $A = C(S) \rtimes_T \mathbb{Z}$. Here, since the space is totally disconnected, $K_0(C(S)) \cong C(S, \mathbb{Z})$ (locally constant integer valued functions), being the direct limit of the $K_0$-groups of the finite dimensional commutative C*-algebras, and $K_1(C(S)) = 0$ for the same reason. Then the exact sequence becomes

$$0 \to K_1(C(S) \rtimes_T \mathbb{Z}) \to C(S, \mathbb{Z}) \xrightarrow{\ \delta = 1 - T\ } C(S, \mathbb{Z}) \to K_0(C(S) \rtimes_T \mathbb{Z}) \to 0, \tag{8.3.9}$$

with $K_0(C(S) \rtimes_T \mathbb{Z}) \cong C(S, \mathbb{Z})_T$. Since the shift $T$ is topologically transitive, i.e. it has a dense orbit, we also have $K_1(C(S) \rtimes_T \mathbb{Z}) \cong C(S, \mathbb{Z})^T \cong \mathbb{Z}$.


In dynamical system language, $K_1$ and $K_0$ are respectively the invariants and coinvariants of the invertible shift $T$ (cf. [BoHa], [PaTu82]).

In terms of the homotopy quotient, one can describe this exact sequence more geometrically in terms of the Thom isomorphism and the $\mu$-map

$$\mu : K^{*+1}(S_T) \cong H^{*+1}(S_T, \mathbb{Z}) \to K_*(C(S) \rtimes_T \mathbb{Z}). \tag{8.3.10}$$

Thus we obtain

$$K_1(A) \cong H^0(S_T) = \mathbb{Z}, \qquad K_0(A) \cong H^1(S_T).$$

The cohomology group $H^1(S_T)$ can be identified with the Čech cohomology group given by the homotopy classes $[S_T, U(1)]$: there is the isomorphism

$$C(S, \mathbb{Z})_T \cong H^1(S_T, \mathbb{Z}) \tag{8.3.11}$$

given explicitly by mapping

$$f \mapsto [\exp(2\pi i t f(x))], \tag{8.3.12}$$

for $f \in C(S, \mathbb{Z})$ and with $[\cdot]$ the homotopy class.


8.3.3 Filtration 


Now let us consider the cohomology group $H^1(S_T, \mathbb{Z})$ and recall the following combinatorial explicit description of the cohomology of the mapping torus $H^1(S_T, \mathbb{Z})$.

There is an identification of $H^1(S_T, \mathbb{Z})$ with the $K_0$-group of the crossed product C*-algebra for the action of $T$ on $S$,

$$H^1(S_T, \mathbb{Z}) \cong K_0(C(S) \rtimes_T \mathbb{Z}). \tag{8.3.13}$$
Theorem 8.6. The cohomology $H^1(S_T, \mathbb{Z})$ satisfies the following properties. The identification (8.3.13) endows $H^1(S_T, \mathbb{Z})$ with a filtration by free abelian groups $F_0 \hookrightarrow F_1 \hookrightarrow \cdots \hookrightarrow F_n \hookrightarrow \cdots$, with $\mathrm{rank}\,F_0 = 2g$ and $\mathrm{rank}\,F_n = 2g(2g-1)^{n-1}(2g-2) + 1$, for $n \ge 1$, so that

$$H^1(S_T, \mathbb{Z}) = \varinjlim_n F_n.$$


In fact, by the Pimsner-Voiculescu six term exact sequence, the group $K_0(C(S) \rtimes_T \mathbb{Z})$ can be identified with the cokernel of the map $1 - T$, acting as $f \mapsto f - f \circ T$ on the $\mathbb{Z}$-module $C(S, \mathbb{Z}) \cong K_0(C(S))$:

$$C(S, \mathbb{Z})_T = C(S, \mathbb{Z})/B(S, \mathbb{Z}) \cong P/\delta P, \tag{8.3.14}$$

where $P \subset C(S, \mathbb{Z})$ is the set of functions that depend only on “future coordinates”, and $\delta$ is the operator $\delta(f) = f - f \circ T$.

This can be identified with functions on the limit set $\Lambda_\Gamma$, since each point in $\Lambda_\Gamma$ is described by an infinite admissible sequence in the generators $\gamma_i$ and their inverses.

Then the set of functions $P$ can be identified with

$$C(S^+, \mathbb{Z}) \cong C(\Lambda_\Gamma, \mathbb{Z})$$

viewed as the submodule of the $\mathbb{Z}$-module $C(S, \mathbb{Z})$ of functions that only depend on future coordinates. Thus, $P$ has a filtration $P = \cup_{n=0}^{\infty} P_n$, where $P_n$ is generated by the characteristic functions of $S^+(w)$ with $w$ of length at most $n + 1$. Taking into account the relations between these, we obtain that $P_n$ is a free abelian group generated by the characteristic functions of $S^+(w)$ with $w$ of length exactly $n + 1$. The number of such words is $2g(2g-1)^n$, hence $\mathrm{rank}\,P_n = 2g(2g-1)^n$. The map $\delta$ satisfies $\delta : P_n \to P_{n+1}$, with a 1-dimensional kernel given by the constant functions. The resulting quotients

$$F_n = P_n/\delta P_{n-1}$$

are torsion free (cf. Theorem 19 §4 of [PaTu82]) and have ranks

$$\mathrm{rank}\,F_n = 2g(2g-1)^{n-1}(2g-2) + 1$$

for $n \ge 1$, while $F_0 \cong P_0$ is of rank $2g$. There is an injection $F_n \hookrightarrow F_{n+1}$ induced by the inclusion $P_n \subset P_{n+1}$, and $P/\delta P$ is the direct limit of the $F_n$ under these inclusions. Thus we obtain the filtration on $H^1(S_T, \mathbb{Z})$:

$$H^1(S_T, \mathbb{Z}) = \varinjlim_n F_n.$$
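The rank formula for the filtration can be checked by simple arithmetic. The sketch below assumes only what the argument above states: the rank of $P_n$ is $2g(2g-1)^n$, the map $\delta$ on $P_{n-1}$ has a rank-one kernel (the constants), and the quotient $F_n = P_n/\delta P_{n-1}$ is torsion free, so ranks subtract. The function names are illustrative.

```python
def rank_P(n, g):
    """rank P_n: number of reduced words of length n + 1, i.e. 2g(2g-1)^n."""
    return 2 * g * (2 * g - 1) ** n


def rank_F(n, g):
    """rank F_n for F_n = P_n / delta(P_{n-1}), using rank-nullity for delta."""
    if n == 0:
        return rank_P(0, g)
    image_rank = rank_P(n - 1, g) - 1  # delta kills only the constants
    return rank_P(n, g) - image_rank


def rank_F_closed_form(n, g):
    """The closed formula 2g(2g-1)^(n-1)(2g-2) + 1 for n >= 1, and 2g for n = 0."""
    if n == 0:
        return 2 * g
    return 2 * g * (2 * g - 1) ** (n - 1) * (2 * g - 2) + 1
```

For genus $g = 2$ this gives ranks 4, 9, 25, 73 for $n = 0, 1, 2, 3$ by either route.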
8.3.4 Hilbert space and grading

It is convenient to consider the complex vector space

$$P_\mathbb{C} = C(\Lambda_\Gamma, \mathbb{Z}) \otimes \mathbb{C}$$

and the corresponding exact sequence computing the cohomology with complex coefficients:

$$0 \to \mathbb{C} \to P_\mathbb{C} \xrightarrow{\ \delta\ } P_\mathbb{C} \to H^1(S_T, \mathbb{C}) \to 0. \tag{8.3.15}$$

The complex vector space $P_\mathbb{C}$ sits in the Hilbert space

$$P_\mathbb{C} \subset \mathcal{L} = L^2(\Lambda_\Gamma, d\mu),$$

where $\mu$ is the Patterson-Sullivan measure on the limit set, satisfying

$$(\gamma^* d\mu)(x) = |\gamma'(x)|^{\delta_H}\, d\mu(x),$$

with $\delta_H = \dim_H(\Lambda_\Gamma)$ the Hausdorff dimension.

This gives a Hilbert space $\mathcal{L}$, together with a filtration $P_n$ by finite dimensional subspaces. In this setting, it is natural to consider a corresponding grading operator,

$$D = \sum_n n\, \hat\Pi_n, \tag{8.3.16}$$

where $\Pi_n$ denotes the orthogonal projection onto $P_n$ and $\hat\Pi_n = \Pi_n \ominus \Pi_{n-1}$.


8.3.5 Cuntz—Krieger algebra 


There is a noncommutative space that encodes nicely the dynamics of the Schottky group $\Gamma$ on its limit set $\Lambda_\Gamma$. This space is given by the Cuntz–Krieger algebra, which carries refined information on the action of the Schottky group on its limit set. In order to define this algebra, consider the $2g \times 2g$ matrix $A$ that gives the admissibility condition for the sequences in $S$: this is the matrix with $\{0,1\}$ entries satisfying $A_{ij} = 1$ for $|i - j| \ne g$ and $A_{ij} = 0$ otherwise.
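As an illustration of how the matrix $A$ encodes admissibility, the following sketch builds $A$ for a given genus and counts admissible words by matrix powers. It assumes zero-based indices $0, \ldots, 2g-1$ (which does not change the condition $|i - j| \ne g$) and uses the standard fact that the number of admissible words with $n + 1$ letters is the sum of the entries of $A^n$; the function names are ours.

```python
def transition_matrix(g):
    """The 2g x 2g matrix with A[i][j] = 1 iff |i - j| != g."""
    size = 2 * g
    return [[1 if abs(i - j) != g else 0 for j in range(size)] for i in range(size)]


def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def count_words(length, g):
    """Number of admissible words with `length` >= 1 letters."""
    a = transition_matrix(g)
    size = 2 * g
    power = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(length - 1):
        power = mat_mul(power, a)
    return sum(sum(row) for row in power)
```

Each row of $A$ has exactly one zero, so every row sum is $2g - 1$ and `count_words(n + 1, g)` reproduces $2g(2g-1)^n$.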

Recall that a partial isometry is a linear operator $S$ satisfying the relation $S = SS^*S$.

The Cuntz–Krieger algebra $O_A$ (cf. [Cu], [CuKrie]) is defined as the universal C*-algebra generated by partial isometries $S_1, \ldots, S_{2g}$, satisfying the relations

$$\sum_j S_j S_j^* = I, \tag{8.3.17}$$

$$S_i^* S_i = \sum_j A_{ij}\, S_j S_j^*. \tag{8.3.18}$$
This algebra is related to the Schottky group by the following result. 
Proposition 8.7. There is an isomorphism

$$O_A \cong C(\Lambda_\Gamma) \rtimes \Gamma. \tag{8.3.19}$$

Up to stabilization (tensoring with compact operators), the algebra has another crossed product description as

$$O_A \simeq F_A \rtimes_T \mathbb{Z}, \tag{8.3.20}$$

with $F_A$ an AF-algebra (an approximately finite dimensional algebra, i.e. a direct limit of finite dimensional C*-algebras).
Let us consider the cochain complex of Hilbert spaces 


(40632275450 


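Proposition 8.8. The C*-algebra $O_A$ admits a faithful representation on the Hilbert space $L^2(\Lambda_\Gamma, d\mu)$.

This is obtained as follows. For $\delta_H = \dim_H(\Lambda_\Gamma)$ the Hausdorff dimension, consider the operators

$$(T_i f)(x) := |(\gamma_i^{-1})'(x)|^{\delta_H/2} f(\gamma_i^{-1} x), \quad \text{and} \quad (P_i f)(x) := \chi_{\gamma_i}(x) f(x), \tag{8.3.21}$$

where $\gamma_i$ are the generators of $\Gamma$, and

$$(T_{\gamma^{-1}} f)(x) := |\gamma'(x)|^{\delta_H/2} f(\gamma x), \tag{8.3.22}$$

for all $\gamma \in \Gamma$.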

Proposition 8.9. The operators

$$S_i := \sum_j A_{ij}\, T_i^* P_j \tag{8.3.23}$$

are partial isometries on $\mathcal{L}$, satisfying the Cuntz-Krieger algebra relations for the matrix $A$ of the subshift of finite type (8.3.1). Thus, the Cuntz–Krieger algebra $O_A$ can be identified with the subalgebra of bounded operators on the Hilbert space $L^2(\Lambda_\Gamma, \mu)$ generated by the $S_i$ as in the equality (8.3.23).


Indeed, the operators $P_i$ are orthogonal projectors, i.e. $P_i P_j = \delta_{ij} P_j$. The composite $T_i^* P_j T_i$ satisfies

$$\sum_j A_{ij} (T_i^* P_j T_i f)(x) = \begin{cases} f(x), & \text{if } P_i(x) = x, \\ 0, & \text{otherwise.} \end{cases}$$

In fact, for $x = g_i y$, we have

$$\sum_j A_{ij} (T_i^* P_j T_i f)(x) = \sum_j A_{ij}\, T_i^* P_j f(g_i^{-1} x) = T_i^* \Big(\sum_j A_{ij} P_j f\Big)(y) = T_i^* (1 - P_{i+g}) f(y) = f(g_i y) = f(x).$$

In the remaining cases with $x \ne g_i y$, we have

$$\sum_j A_{ij} (T_i^* P_j T_i f)(x) = T_i^* A_{i,i+g} P_{i+g} f(g_{i+g} x) = 0,$$

where the last equality follows from the fact that the transition matrix $A$ satisfies $A_{i,i+g} = 0$. This implies that the $S_i$ satisfy

$$S_i S_i^* = \sum_j A_{ij}\, T_i^* P_j T_i = P_i.$$

Since the projectors $P_i$ satisfy $\sum_i P_i = I$, we obtain the relation (8.3.17). Moreover, since $T_i T_i^* = 1$, and the entries $A_{ij}$ are all zeroes and ones, we also obtain

$$S_i^* S_i = \sum_{j,k} A_{ij} A_{ik}\, P_k T_i T_i^* P_j = \sum_j A_{ij} P_j.$$

Replacing $P_j = S_j S_j^*$ from (8.3.17) we then obtain (8.3.18).


The Cuntz-Krieger algebra $O_A$ can be described in terms of the action of the free group $\Gamma$ on its limit set $\Lambda_\Gamma$ (cf. [Rob01], [Spi91]), so that we can regard $O_A$ as a noncommutative space replacing the classical quotient $\Lambda_\Gamma/\Gamma$.


Spectral triples for Schottky groups 


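Let us consider the diagonal action of the algebra $O_A$ on the Hilbert space $\mathcal{H} = \mathcal{L} \oplus \mathcal{L}$, and define the Dirac operator as

$$D|_{\mathcal{L} \oplus 0} = \sum_n (n+1)(\hat\Pi_n \oplus 0), \qquad D|_{0 \oplus \mathcal{L}} = -\sum_n n\,(0 \oplus \hat\Pi_n). \tag{8.3.24}$$

Theorem 8.10. For a Schottky group $\Gamma$ with $\dim_H(\Lambda_\Gamma) < 1$, the data $(O_A, \mathcal{H}, D)$, for $\mathcal{H} = \mathcal{L} \oplus \mathcal{L}$ with the diagonal action of $O_A$ through the representation (8.3.23) and the Dirac operator (8.3.24), define a non-finitely summable, $\theta$-summable spectral triple.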


The key point of this result is the compatibility relation between the algebra and the Dirac operator, namely the fact that the commutators $[D, a]$ are bounded operators, for all $a \in O_A^{alg}$, the involutive algebra generated algebraically by the $S_i$, subject to the Cuntz–Krieger relations.

This follows by an estimate on the norm of the commutators $\|[D, S_i]\|$ and $\|[D, S_i^*]\|$, in terms of the Poincaré series of the Schottky group (using $\delta_H < 1$):

$$\sum_{\gamma \in \Gamma} |\gamma'|^s, \qquad s = 1 > \delta_H,$$

where the Hausdorff dimension $\delta_H$ is the exponent of convergence of the Poincaré series.

The dimension of the $n$-th eigenspace of $D$ is $2g(2g-1)^{n-1}(2g-2)$ for $n \ge 1$, $2g$ for $n = 0$, and $2g(2g-1)^{-n-1}(2g-2)$ for $n \le -1$, so the spectral triple is not finitely summable, since $|D|^z$ is not of trace class.

It is $\theta$-summable, since the operator $\exp(-tD^2)$ is of trace class, for all $t > 0$.
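The contrast between the two summability conditions can be seen numerically. The sketch below assumes eigenvalue $n$ with multiplicity $2g(2g-1)^{|n|-1}(2g-2)$ for $n \ne 0$ and $2g$ for $n = 0$, i.e. the exponential growth coming from the ranks of the $P_n$; the function names and the truncation parameter `n_max` are ours. The terms of $\mathrm{Tr}\,|D|^{-s}$ grow without bound for every fixed $s$ (for $g \ge 2$), whereas the Gaussian weight in $\mathrm{Tr}\,\exp(-tD^2)$ beats the exponential growth.

```python
import math


def eigenspace_dim(n, g):
    """Assumed multiplicity of the eigenvalue n of D."""
    if n == 0:
        return 2 * g
    return 2 * g * (2 * g - 1) ** (abs(n) - 1) * (2 * g - 2)


def power_trace_terms(s, g, n_max):
    """Terms dim_n * n^(-s), n = 1..n_max, of the would-be trace of |D|^(-s)."""
    return [eigenspace_dim(n, g) * n ** (-s) for n in range(1, n_max + 1)]


def theta_trace(t, g, n_max):
    """Partial sum over |n| <= n_max of the trace of exp(-t D^2)."""
    return sum(
        eigenspace_dim(n, g) * math.exp(-t * n * n)
        for n in range(-n_max, n_max + 1)
    )
```

Doubling `n_max` in `theta_trace` changes the value only by a negligible amount, while the last entries of `power_trace_terms` keep increasing however large `s` is chosen.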

Using the description (8.3.20) of the noncommutative space as crossed product of an AF-algebra by the action of the shift, $F_A \rtimes_T \mathbb{Z}$, one may be able to find a 1-summable spectral triple. Here the dense subalgebra should not contain any of the group elements.


8.3.6 Arithmetic surfaces: homology and cohomology 


In the particular case of arithmetic surfaces, there is an identification (found in [Cons98], [CM])

$$H^\cdot(\mathrm{Cone}(N)^\cdot) \cong H^\cdot \oplus \check{H}^\cdot, \tag{8.3.25}$$

where $H^\cdot$ is the Archimedean cohomology, and $\check{H}^\cdot$ its dual under the involution $S$ of (8.2.17).

We can then extend the identification

$$U : H^1 \xrightarrow{\ \simeq\ } V \subset \mathcal{L}$$

by considering a subspace $W$ of the homology $H_1(S_T)$ of a certain topological Smale space, attached to the Schottky uniformization of $X(\mathbb{C})$, in such a way that $W \cong \check{H}^1$. This homology group $H_1(S_T)$ admits a very explicit combinatorial description, and can be computed as a direct limit

$$H_1(S_T, \mathbb{Z}) = \varinjlim_N K_N,$$

where the groups $K_N$ are free Abelian of rank $(2g-1)^N + 1$ for $N$ even, and $(2g-1)^N + (2g-1)$ for $N$ odd. The $\mathbb{Z}$-module $K_N$ is generated by the closed geodesics represented by periodic sequences in $S$ of period $N + 1$. These need not be primitive closed geodesics.

In terms of primitive closed geodesics one can write equivalently

$$H_1(S_T, \mathbb{Z}) = \bigoplus_{N=0}^{\infty} R_N,$$

where $R_N$ are free Abelian groups with

$$\mathrm{rk}\, R_N = \frac{1}{N} \sum_{d | N} \mu(d)\, \mathrm{rk}\, K_{N/d}$$

($\mu(d)$ is the Möbius function).
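The Möbius inversion used in the rank formula for $R_N$ is the standard passage from counts of all periodic objects to counts of primitive ones. The following sketch implements the Möbius function and the sum $\frac{1}{N}\sum_{d | N} \mu(d)\, c(N/d)$ for an arbitrary count function $c$ supplied by the caller. The demonstration input $c(m) = 2^m$ (the full shift on two symbols) is our own illustrative choice and is not the rank of $K_N$.

```python
def mobius(n):
    """Moebius function of a positive integer n, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def primitive_count(N, total):
    """(1/N) * sum over d | N of mu(d) * total(N // d), as an exact integer."""
    s = sum(mobius(d) * total(N // d) for d in range(1, N + 1) if N % d == 0)
    if s % N != 0:
        raise ValueError("counts are not compatible with Moebius inversion")
    return s // N
```

With `total = lambda m: 2 ** m` the function returns the number of primitive binary necklaces of length `N`.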
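There is a natural pairing of homology and cohomology given by

$$\langle \cdot, \cdot \rangle : F_n \times K_N \to \mathbb{Z}, \qquad \langle [f], x \rangle = N \cdot f(\bar{x}). \tag{8.3.26}$$

This determines a graded subspace $W \subset H_1(S_T, \mathbb{Z})$ dual to $V \subset H^1(S_T)$. With the identification

$$\begin{array}{ccccc}
H^1 & \cong & V & \subset & H^1(S_T) \\
\downarrow S & & \downarrow & & \\
\check{H}^1 & \cong & W & \subset & H_1(S_T, \mathbb{Z})
\end{array} \tag{8.3.27}$$

one can identify the Dirac operator of (8.3.24) with the logarithm of Frobenius

$$D|_{V \oplus W} = \Phi|_{H^1 \oplus \check{H}^1}. \tag{8.3.28}$$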


8.3.7 Archimedean factors from dynamics 


The dynamical spectral triple described in Theorem 8.10 is not finitely sum- 
mable. However, it is still possible to recover from these data the local factor 
at arithmetic infinity. 


As in the previous sections, we consider a fixed Archimedean prime given by a real embedding $\alpha : K \to \mathbb{R}$, such that the corresponding Riemann surface $X_{/\mathbb{R}}$ is an orthosymmetric smooth real algebraic curve of genus $g \ge 2$. The dynamical spectral triple provides another interpretation of the Archimedean factor $L_\mathbb{R}(H^1(X_{/\mathbb{R}}, \mathbb{R}), s) = \Gamma_\mathbb{C}(s)^g$.


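Proposition 8.11. Consider the zeta functions

$$\zeta_{\pi(V), D}(s, z) = \sum_{\lambda \in \mathrm{Spec}(D)} \mathrm{Tr}(\pi(V)\Pi(\lambda, D))(s - \lambda)^{-z}, \tag{8.3.29}$$

for $\pi(V)$ the orthogonal projection on the norm closure of $0 \oplus V$ in $\mathcal{H}$. The corresponding regularized determinants satisfy

$$\exp\left(-\frac{d}{dz} \zeta_{\pi(V), D/2\pi}(s/2\pi, z)\Big|_{z=0}\right)^{-1} = L_\mathbb{C}(H^1(X), s), \tag{8.3.30}$$

(cf. Proposition 6.8 of [CM]).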


8.3.8 A Dynamical theory for Mumford curves 


Let $K$ denote a finite extension of $\mathbb{Q}_p$ and $\Delta_K$ the Bruhat-Tits tree associated to $G = PGL(2, K)$. Let us recall a few results about the action of a Schottky group on a Bruhat-Tits tree and on C*-algebras of graphs. Detailed explanations are contained in [Man76a], [Mum72], and [CM04a].
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Recall that the Bruhat—Tits tree is constructed as follows. One considers 
the set of free O-modules of rank 2: M Cc V. Two such modules are equivalent 
M, ~ Mg if there exists an element A € K*, such that M, = \Mg. The group 
GL(V) of linear automorphisms of V operates on the set of such modules on 
the left: gM = {gm |m€ M}, g € GL(V). Notice that the relation M, ~ M2 
is equivalent to the condition that M, and M2 belong to the same orbit of the 
center K* C GL(V). Hence, the group G = GL(V)/K™* operates (on the left) 
on the set of classes of equivalent modules. 

We denote by AQ. the set of such classes and by {M} the class of the 
module M. Because O is a principal ideals domain and every module M has 
two generators, it follows that 


{Mi}, {M2} € Ak, My > M2 > M,/M2 ~ O/m' 6 O/m*, LkEN. 


The multiplication of M, and M2 by elements of K preserves the inclusion 
M, D> Mg, hence the natural number 


d({My},{M2}) = |l— k| (8.3.31) 


is well defined. 

The graph Ax of the group PGL(2, K) is the infinite graph with set of ver- 
tices AY, in which two vertices {M1}, {M2} are adjacent and hence connected 
by an edge if and only if d({14,}, {M2}) = 1. (cf. [Man76a] and [Mum72].) 
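The distance (8.3.31) can be computed concretely. The sketch below is only an illustration under stated assumptions: it takes $K = \mathbb{Q}_p$, represents a lattice as $h \cdot \mathbb{Z}_p^2$ for an invertible $2 \times 2$ rational matrix $h$, and measures the distance from the class of the standard lattice. It uses the fact that, after rescaling $h$ so that its entries are integral with minimal valuation $0$, the elementary divisors are $1$ and $p^{l}$ with $l = v_p(\det h) - 2\min v_p(h_{ij})$. The names `valuation` and `tree_distance` are hypothetical.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is infinite")
    v = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def tree_distance(h, p):
    """Distance in the tree between the classes of Z_p^2 and h * Z_p^2.

    h is an invertible 2x2 matrix with rational entries, given as nested lists.
    Scaling h by a power of p does not change the class, so we normalise by
    the minimal entry valuation and read off |l - k| from the determinant.
    """
    (a, b), (c, d) = h
    det = Fraction(a) * Fraction(d) - Fraction(b) * Fraction(c)
    if det == 0:
        raise ValueError("matrix must be invertible")
    m = min(valuation(x, p) for x in (a, b, c, d) if x != 0)
    return valuation(det, p) - 2 * m


if __name__ == "__main__":
    p = 3
    print(tree_distance([[p, 0], [0, 1]], p))   # adjacent vertices
    print(tree_distance([[p, 0], [0, p]], p))   # same class, scaled lattice
```

Two vertices are adjacent in this model exactly when `tree_distance` returns 1, matching the adjacency condition stated above.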

For a Schottky group $\Gamma \subset PGL(2,K)$ there is a smallest subtree $\Delta'_\Gamma \subset \Delta_K$
containing the axes of all elements of $\Gamma$. The set of ends of $\Delta'_\Gamma$ in $\mathbb{P}^1(K)$ is
$\Lambda_\Gamma$, the limit set of $\Gamma$. The group $\Gamma$ carries $\Delta'_\Gamma$ into itself, so that the quotient
$\Delta'_\Gamma/\Gamma$ is a finite graph that coincides with the dual graph of the closed fibre
of the minimal smooth model of the algebraic curve $C/K$ holomorphically
isomorphic to $X_\Gamma := \Omega_\Gamma/\Gamma$ (cf. [Mum72] p. 163). There is a smallest tree $\Delta_\Gamma$
on which $\Gamma$ acts and such that $\Delta_\Gamma/\Gamma$ is the (finite) graph of the specialization
of $C$. The curve $C$ is a $k$-split degenerate, stable curve. When the genus of the
fibers is at least 2, i.e. when the Schottky group has at least $g \ge 2$ generators,
the curve $X_\Gamma$ is called a Schottky-Mumford curve.

The possible graphs $\Delta_\Gamma/\Gamma$ and the corresponding fiber for the case of
genus 2 are illustrated in Figure 8.15.

Let us now describe a dynamical system associated to the space $\mathcal{W}(\Delta/\Gamma)$ of
walks on the directed tree $\Delta$ on which $\Gamma$ acts. In particular, we are interested
in the cases when $\Delta = \Delta_K, \Delta_\Gamma$.

For $\Delta = \Delta_\Gamma$, one obtains a subshift of finite type associated to the action
of the Schottky group $\Gamma$ on the limit set $\Lambda_\Gamma$, of the type that was considered
in [CM].

Let $V \subset \Delta_\Gamma$ be a finite subtree whose set of edges consists of one
representative for each $\Gamma$-class. This is a fundamental domain for $\Gamma$ in the weak sense
(following the notation of [Man76a]), since some vertices may be identified
under the action of $\Gamma$. Correspondingly, $V \subset \mathbb{P}^1(K)$ is the set of ends of all
infinite paths starting at points in $V$.

Fig. 8.15. The graphs $\Delta_\Gamma/\Gamma$ for genus $g = 2$, and the corresponding fibers.

Consider the set $\mathcal{W}(\Delta_\Gamma/\Gamma)$ of doubly infinite walks on the finite graph
$\Delta_\Gamma/\Gamma$. These are doubly infinite admissible sequences in the finite alphabet
given by the edges of $V$ with both possible orientations. On $\mathcal{W}(\Delta_\Gamma/\Gamma)$ we
consider the topology generated by the sets
$\mathcal{W}^s(\omega,\ell) = \{\tilde\omega \in \mathcal{W}(\Delta_\Gamma/\Gamma) : \tilde\omega_k = \omega_k,\ k \ge \ell\}$ and
$\mathcal{W}^u(\omega,\ell) = \{\tilde\omega \in \mathcal{W}(\Delta_\Gamma/\Gamma) : \tilde\omega_k = \omega_k,\ k \le \ell\}$, for
$\omega \in \mathcal{W}(\Delta_\Gamma/\Gamma)$ and $\ell \in \mathbb{Z}$. With this topology, the space $\mathcal{W}(\Delta_\Gamma/\Gamma)$ is a
totally disconnected compact Hausdorff space.

The invertible shift map $T$, given by $(T\omega)_k = \omega_{k+1}$, is a homeomorphism
of $\mathcal{W}(\Delta_\Gamma/\Gamma)$. One can describe again the dynamical system $(\mathcal{W}(\Delta_\Gamma/\Gamma), T)$
in terms of subshifts of finite type.


Lemma 8.12. The space $\mathcal{W}(\Delta_\Gamma/\Gamma)$ with the action of the invertible shift $T$
is a subshift of finite type, where $\mathcal{W}(\Delta_\Gamma/\Gamma) = \mathcal{S}_A$ with $A$ the directed edge
matrix of the finite graph $\Delta_\Gamma/\Gamma$.


Genus two example 


In the example of Mumford-Schottky curves of genus $g = 2$, the tree $\Delta_\Gamma$ is
illustrated in Figure 8.16.

Fig. 8.16. The graphs $\Delta_\Gamma/\Gamma$ for genus $g = 2$, and the corresponding trees $\Delta_\Gamma$.

In the first case in Figure 8.16, the tree $\Delta_\Gamma$ is just a copy of the Cayley
graph of the free group $\Gamma$ on two generators, hence we can identify doubly
infinite walks in $\Delta_\Gamma$ with doubly infinite reduced words in the generators of
$\Gamma$ and their inverses. The directed edge matrix is given by

$$A = \begin{pmatrix} 1&1&0&1\\ 1&1&1&0\\ 0&1&1&1\\ 1&0&1&1 \end{pmatrix}.$$
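The first matrix can be regenerated from the reduced-word condition. This is a minimal sketch, assuming the letters are ordered as generators followed by their inverses (so letter `i` and letter `i + g` are mutually inverse) and that a letter may be followed by any letter except its own inverse; the ordering and the function names are illustrative choices, not taken from the source.

```python
def directed_edge_matrix(g):
    """0/1 matrix for reduced words in a free group on g generators.

    Letters 0..g-1 stand for the generators and g..2g-1 for their inverses.
    Entry (i, j) is 1 when letter j may follow letter i, i.e. when j is not
    the inverse of i.
    """
    n = 2 * g
    return [[0 if j == (i + g) % n else 1 for j in range(n)] for i in range(n)]


def count_admissible_words(A, length):
    """Number of words w_0 ... w_{length-1} with A[w_k][w_{k+1}] == 1."""
    size = len(A)
    counts = [1] * size  # words of length 1, indexed by their last letter
    for _ in range(length - 1):
        counts = [sum(counts[i] * A[i][j] for i in range(size))
                  for j in range(size)]
    return sum(counts)


if __name__ == "__main__":
    A = directed_edge_matrix(2)
    for row in A:
        print("".join(str(x) for x in row))
    print([count_admissible_words(A, n) for n in (1, 2, 3)])
```

For $g = 2$ this reproduces the four rows printed above, and the word counts grow as $2g(2g-1)^{n-1}$, the number of reduced words of length $n$.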


In the second case in Figure 8.16, we label by $a = e_1$, $b = e_2$ and $c = e_3$
the oriented edges in the graph $\Delta_\Gamma/\Gamma$, so that we have a corresponding set
of labels $E = \{a,b,c,\bar a,\bar b,\bar c\}$ for the edges in the covering $\Delta_\Gamma$. A choice of
generators for the group $\Gamma \simeq \mathbb{Z} * \mathbb{Z}$ acting on $\Delta_\Gamma$ is obtained by identifying
the generators $g_1$ and $g_2$ of $\Gamma$ with the chains of edges $a\bar b$ and $a\bar c$. Doubly
infinite walks in the tree $\Delta_\Gamma$ are admissible doubly infinite sequences of such
labels, where admissibility is determined by the directed edge matrix

$$A = \begin{pmatrix} 0&1&0&0&0&1\\ 1&0&1&0&0&0\\ 0&1&0&1&0&0\\ 0&0&1&0&1&0\\ 0&0&0&1&0&1\\ 1&0&0&0&1&0 \end{pmatrix}.$$


The third case in Figure 8.16 is analogous. A choice of generators for
the group $\Gamma \simeq \mathbb{Z} * \mathbb{Z}$ acting on $\Delta_\Gamma$ is given by $ab\bar a$ and $c$. Doubly infinite
walks in the tree $\Delta_\Gamma$ are admissible doubly infinite sequences in the alphabet
$E = \{a,b,c,\bar a,\bar b,\bar c\}$, with admissibility determined by the directed edge matrix

$$A = \begin{pmatrix} 0&0&1&0&0&1\\ 1&1&0&0&0&0\\ 0&0&1&1&0&0\\ 0&1&0&0&1&0\\ 1&0&0&0&1&0\\ 0&0&0&1&0&1 \end{pmatrix}.$$


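The construction is analogous for genus $g > 2$, for the various possible
finite graphs $\Delta_\Gamma/\Gamma$. The directed edge matrix can then be written in block
form as

$$A = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix},$$

where each block $\alpha_{ij}$ is a $\#(\Delta_\Gamma/\Gamma)^1_+ \times \#(\Delta_\Gamma/\Gamma)^1_+$ matrix with $\alpha_{12} = \alpha_{12}^t$,
$\alpha_{21} = \alpha_{21}^t$, and $\alpha_{11} = \alpha_{22}^t$.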
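The block relations can be tested against the three genus two matrices printed above. The sketch assumes the relations are read as: the two off-diagonal blocks are symmetric, and the upper-left block equals the transpose of the lower-right block. The matrices are transcribed row by row from the text; the helper names are hypothetical.

```python
def parse(rows):
    """Turn strings of 0/1 digits into a list-of-lists integer matrix."""
    return [[int(ch) for ch in row] for row in rows]


def transpose(M):
    return [list(col) for col in zip(*M)]


def blocks(A):
    """Split a 2h x 2h matrix into its four h x h blocks."""
    h = len(A) // 2
    a11 = [row[:h] for row in A[:h]]
    a12 = [row[h:] for row in A[:h]]
    a21 = [row[:h] for row in A[h:]]
    a22 = [row[h:] for row in A[h:]]
    return a11, a12, a21, a22


def has_block_symmetry(A):
    """Check a12 = a12^t, a21 = a21^t and a11 = a22^t."""
    a11, a12, a21, a22 = blocks(A)
    return (a12 == transpose(a12)
            and a21 == transpose(a21)
            and a11 == transpose(a22))


FIRST = parse(["1101", "1110", "0111", "1011"])
SECOND = parse(["010001", "101000", "010100", "001010", "000101", "100010"])
THIRD = parse(["001001", "110000", "001100", "010010", "100010", "000101"])

if __name__ == "__main__":
    for name, A in (("first", FIRST), ("second", SECOND), ("third", THIRD)):
        print(name, has_block_symmetry(A))
```

All three matrices pass the check under this reading, which gives some support for that reading of the block conditions.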


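8.3.9 Cohomology of $\mathcal{W}(\Delta/\Gamma)_T$


Let $\Delta = \Delta_\Gamma$. We identify the first cohomology group $H^1(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\mathbb{Z})$ with
the group of homotopy classes of continuous maps of $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$ to the circle.
Let $C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})$ be the $\mathbb{Z}$-module of integer valued continuous functions
on $\mathcal{W}(\Delta_\Gamma/\Gamma)$, and let

$$C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})_T := \mathrm{Coker}(\delta),$$

for $\delta(f) = f - f \circ T$. The analog of Theorem 8.6 holds:


Proposition 8.13. The map $f \mapsto [\exp(2\pi i t f(x))]$, which associates to an
element $f \in C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})$ a homotopy class of maps from $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$
to the circle, gives an isomorphism $C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})_T \simeq H^1(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\mathbb{Z})$.
Moreover, there is a filtration of $C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})_T$ by free $\mathbb{Z}$-modules

$$F_0 \subset F_1 \subset \cdots \subset F_n \subset \cdots,$$

of rank $\theta_n - \theta_{n-1} + 1$, where $\theta_n$ is the number of admissible words of length
$n+1$ in the alphabet, so that we have

$$H^1(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\mathbb{Z}) = \varinjlim_n F_n.$$

The quotients $F_{n+1}/F_n$ are also torsion free.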


(cf. [CM04a]). 
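The ranks in the filtration are easy to tabulate once a directed edge matrix is fixed. The following sketch assumes the rank formula is read as $\theta_n - \theta_{n-1} + 1$ for $n \ge 1$, and uses the fact that the number of admissible words of length $n+1$ is the sum of the entries of $A^n$. The example matrix is the second genus two matrix from the text; `theta` and `filtration_rank` are illustrative names.

```python
def theta(A, n):
    """Number of admissible words of length n + 1 (sum of entries of A^n)."""
    size = len(A)
    counts = [1] * size  # one word of length 1 ending in each letter
    for _ in range(n):
        counts = [sum(counts[i] * A[i][j] for i in range(size))
                  for j in range(size)]
    return sum(counts)


def filtration_rank(A, n):
    """Rank theta_n - theta_{n-1} + 1 of F_n, for n >= 1."""
    if n < 1:
        raise ValueError("this sketch only covers n >= 1")
    return theta(A, n) - theta(A, n - 1) + 1


# Second genus two directed edge matrix, transcribed from the text.
SECOND = [[int(ch) for ch in row] for row in
          ["010001", "101000", "010100", "001010", "000101", "100010"]]

if __name__ == "__main__":
    print([theta(SECOND, n) for n in range(4)])
    print([filtration_rank(SECOND, n) for n in range(1, 4)])
```

Since every row of this matrix has exactly two nonzero entries, $\theta_n$ doubles at each step, which makes the example easy to check by hand.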

The space $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$ corresponds to a space of "bounded geodesics" on
the graph $\Delta_K/\Gamma$, where geodesics, in this setting, are just doubly infinite walks
in $\Delta_K/\Gamma$. In particular, a closed geodesic is the image under the quotient map
$\pi_\Gamma : \Delta_K \to \Delta_K/\Gamma$ of a doubly infinite walk in the Bruhat-Tits tree $\Delta_K$ with
ends given by the pair $z^+(\gamma), z^-(\gamma)$ of fixed points of some element $\gamma \in \Gamma$.
Similarly, a bounded geodesic is an element $\omega \in \mathcal{W}(\Delta_K/\Gamma)$ which is the
image, under the quotient map, of a doubly infinite walk in $\Delta_K$ with both
ends on $\Lambda_\Gamma \subset \mathbb{P}^1(K)$. This implies that a bounded geodesic is a walk of the
form $\omega = \pi_\Gamma(\tilde\omega)$, for some $\tilde\omega \in \mathcal{W}(\Delta_\Gamma/\Gamma)$. By construction, any such walk is
an axis of $\Delta_\Gamma$.

Orbits of $\mathcal{W}(\Delta_\Gamma/\Gamma)$ under the action of the invertible shift $T$ correspond
bijectively to orbits of the complement of the diagonal in $\Lambda_\Gamma \times \Lambda_\Gamma$ under the
action of $\Gamma$. Thus, we see that $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$ gives a geometric realization of
the space of "bounded geodesics" on the graph $\Delta_K/\Gamma$, much as, in the case of
the geometry at arithmetic infinity, we used the mapping torus of the shift $T$
as a model of the tangle of bounded geodesics in a hyperbolic handlebody.

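As in the case at infinity, we can consider the Pimsner-Voiculescu exact
sequence computing the K-theory groups of the crossed product $C^*$-algebra
$C(\mathcal{W}(\Delta_\Gamma/\Gamma)) \rtimes_T \mathbb{Z}$,

$$0 \to H^0(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\mathbb{Z}) \to C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z}) \xrightarrow{\ \delta\ } C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z}) \to H^1(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\mathbb{Z}) \to 0. \qquad (8.3.32)$$

In the corresponding sequence

$$0 \to H^0(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\kappa) \to \mathcal{P} \xrightarrow{\ \delta\ } \mathcal{P} \to H^1(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\kappa) \to 0, \qquad (8.3.33)$$

for the cohomology $H^*(\mathcal{W}(\Delta_\Gamma/\Gamma)_T,\kappa)$, with $\kappa = \mathbb{R}$ or $\mathbb{C}$, we can take the
vector space $\mathcal{P}$ obtained, as in the case at infinity, by tensoring with $\kappa$ the
$\mathbb{Z}$-module $\mathcal{P} \subset C(\mathcal{W}(\Delta_\Gamma/\Gamma),\mathbb{Z})$ of functions of future coordinates, where
$\mathcal{P} \simeq C(\mathcal{W}^+(\Delta_\Gamma/\Gamma),\mathbb{Z})$. This has a filtration $\mathcal{P} = \cup_n \mathcal{P}_n$, where $\mathcal{P}_n$ is identified with
the submodule of $C(\mathcal{W}^+(\Delta_\Gamma/\Gamma),\mathbb{Z})$ generated by characteristic functions of
$\mathcal{W}^+(\Delta_\Gamma/\Gamma,\rho) \subset \mathcal{W}^+(\Delta_\Gamma/\Gamma)$, where $\rho \in \mathcal{W}^*(\Delta_\Gamma/\Gamma)$ is a finite walk
$\rho = w_0 \cdots w_n$ of length $n+1$, and $\mathcal{W}^+(\Delta_\Gamma/\Gamma,\rho)$ is the set of infinite paths
$\omega \in \mathcal{W}^+(\Delta_\Gamma/\Gamma)$ with $\omega_k = w_k$ for $0 \le k < n+1$. This filtration defines the
terms $F_n = \mathcal{P}_n/\delta\mathcal{P}_{n-1}$ in the filtration of the dynamical cohomology of the
Mumford curve, as in Proposition 8.13. Again, we will use the same notation
in the following for the free $\mathbb{Z}$-module $\mathcal{P}_n$ of functions of at most $n+1$ future
coordinates and the vector space obtained by tensoring $\mathcal{P}_n$ by $\kappa$.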

We obtain a Hilbert space completion of the space $\mathcal{P}$ of cochains in (8.3.33)
by considering $\mathcal{L} = L^2(\Lambda_\Gamma, \mu)$ defined with respect to the measure on $\Lambda_\Gamma = \partial\Delta_\Gamma$
given by assigning its value on the clopen set $V(v)$, given by the ends of
all paths in $\Delta_\Gamma$ starting at a vertex $v$, to be

$$\mu(V(v)) = q^{-d(v)-1},$$

with $q = \mathrm{card}(\mathcal{O}/\mathfrak{m})$.


In [CM] §4, it was shown how the mapping torus $\mathcal{S}_T$ of the subshift of finite
type $(\mathcal{S}, T)$, associated to the limit set of the Schottky group, maps surjectively
to the tangle of bounded geodesics inside the hyperbolic handlebody, through
a map that resolves all the points of intersection of different geodesics. In the
case of the Mumford curve, where we replace the real hyperbolic 3-space by
the Bruhat-Tits building $\Delta_K$, the analog of the surjective map from $\mathcal{S}_T$ to
the tangle of bounded geodesics is a map from $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$ to the dual graph
$\Delta_\Gamma/\Gamma$. Here is a description of this map.

As before, let us write elements of $\mathcal{W}(\Delta_\Gamma/\Gamma)$ as admissible doubly infinite
sequences

$$\omega = \ldots \omega_{i_{-m}} \ldots \omega_{i_{-1}} \omega_{i_0} \omega_{i_1} \ldots \omega_{i_n} \ldots,$$

with the $\omega_{i_k} = \{e_{i_k}, \epsilon_{i_k}\}$ oriented edges on the graph $\Delta_\Gamma/\Gamma$. We consider each
oriented edge $\omega$ of normalized length one, so that it can be parameterized as
$\omega(t) = \{e(t), \epsilon\}$, for $0 \le t \le 1$, with $\bar\omega(t) = \{e(1-t), -\epsilon\}$. Since $\omega \in \mathcal{S}_A$ is an
admissible sequence of oriented edges we have $\omega_{i_k}(1) = \omega_{i_{k+1}}(0) \in \Delta_\Gamma^{(0)}$.

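We consider a map of the covering space $\mathcal{W}(\Delta_\Gamma/\Gamma) \times \mathbb{R}$ of $\mathcal{W}(\Delta_\Gamma/\Gamma)_T$ to
$|\Delta_\Gamma|$ of the form

$$\tilde E(\omega,\tau) = \omega_{i_{[\tau]}}(\tau - [\tau]). \qquad (8.3.34)$$

Here $|\Delta_\Gamma|$ denotes the geometric realization of the graph. By construction,
the map $\tilde E$ satisfies $\tilde E(T\omega, \tau) = \tilde E(\omega, \tau + 1)$, hence it descends to a map $\tilde E$ of
the quotient

$$\tilde E : \mathcal{W}(\Delta_\Gamma)_T \to |\Delta_\Gamma|. \qquad (8.3.35)$$

We then obtain a map to $|\Delta_\Gamma/\Gamma|$, by composing with the quotient map of the
$\Gamma$ action, $\pi_\Gamma : \Delta_\Gamma \to \Delta_\Gamma/\Gamma$, that is,

$$\bar E := \pi_\Gamma \circ \tilde E : \mathcal{W}(\Delta_\Gamma)_T \to |\Delta_\Gamma/\Gamma|. \qquad (8.3.36)$$

Thus, we obtain the following.


Proposition 8.14. The map $\bar E$ of (8.3.36) is a continuous surjection from
the mapping torus $\mathcal{W}(\Delta_\Gamma)_T$ to the geometric realization $|\Delta_\Gamma/\Gamma|$ of the finite
graph $\Delta_\Gamma/\Gamma$.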
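The compatibility with the shift that lets the map descend to the mapping torus can be illustrated with a toy model. This is only a sketch under stated assumptions: a walk is modelled as a Python function from integers to edge labels, a point of an edge as a pair (label, position in $[0,1)$), and the map is read as sending $(\omega, \tau)$ to the point at position $\tau - [\tau]$ on the edge with index $[\tau]$. The names `shift`, `E` and the periodic sample walk are hypothetical.

```python
from fractions import Fraction
from math import floor


def shift(omega):
    """The invertible shift: (T omega)_k = omega_{k+1}."""
    return lambda k: omega(k + 1)


def E(omega, tau):
    """Point at time tau along the walk omega.

    Returns (edge label, position along that edge). Exact rational
    arithmetic avoids floating point noise in tau - floor(tau).
    """
    tau = Fraction(tau)
    n = floor(tau)
    return omega(n), tau - n


def sample_walk(k):
    """A hypothetical periodic walk on three edge labels."""
    return "abc"[k % 3]


if __name__ == "__main__":
    tau = Fraction(5, 2)
    print(E(sample_walk, tau))
    print(E(shift(sample_walk), tau), E(sample_walk, tau + 1))
```

The last line prints the same point twice, which is the identity that makes the map well defined on the quotient by $(\omega, \tau + 1) \sim (T\omega, \tau)$.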

8.3.10 Spectral triples and Mumford curves 


Let us consider the Hilbert space $\mathcal{H} = \mathcal{L} \oplus \mathcal{L}$ and the operator $D$ defined as

$$D|_{\mathcal{L}\oplus 0} = -\frac{2\pi}{R \log q} \sum_n (n+1)\,\hat\Pi_n, \qquad D|_{0\oplus\mathcal{L}} = \frac{2\pi}{R\log q} \sum_n n\,\hat\Pi_n, \qquad (8.3.37)$$

where $\hat\Pi_n = \Pi_n - \Pi_{n-1}$ are the orthogonal projections associated to the
filtration $\mathcal{P}_n$, the integer $R$ is the length of all the words representing the
generators of $\Gamma$ (this can be taken to be the same for all generators, possibly
after blowing up a finite number of points on the special fiber, as explained
in [CM03]), and $q = \mathrm{card}(\mathcal{O}/\mathfrak{m})$.


Theorem 8.15. Consider the tree $\Delta_\Gamma$ of the $p$-adic Schottky group acting on
$\Delta_K$.

1. There is a representation of the algebra $C^*(\Delta_\Gamma/\Gamma)$ by bounded linear
operators on the Hilbert space $\mathcal{L}$.

2. The data $(C^*(\Delta_\Gamma/\Gamma), \mathcal{H}, D)$, with the algebra acting diagonally on
$\mathcal{H} = \mathcal{L} \oplus \mathcal{L}$, and the Dirac operator $D$ of (8.3.37) form a spectral triple.


Recall that, for a curve $X$ over a global field $K$, assuming semi-stability at
all places of bad reduction, the local Euler factor at a place $v$ has the following
description ([Se70a]):

$$L_v(H^1(X), s) = \det\left(1 - Fr_v^*\, N(v)^{-s} \,\big|\, H^1(\bar X, \mathbb{Q}_\ell)^{I_v}\right)^{-1}. \qquad (8.3.38)$$

Here $Fr_v^*$ is the geometric Frobenius acting on $\ell$-adic cohomology of
$\bar X = X \otimes \mathrm{Spec}(\bar K)$, with $\bar K$ the algebraic closure and $\ell$ a prime with $(\ell, q) = 1$,
where $q$ is the cardinality of the residue field $k(v)$ at $v$. We denote by $N$ the
norm map. The determinant is evaluated on the inertia invariants $H^1(\bar X,\mathbb{Q}_\ell)^{I_v}$
at $v$ (all of $H^1(\bar X,\mathbb{Q}_\ell)$ when $v$ is a place of good reduction).

Suppose $v$ is a place of $k(v)$-split degenerate reduction. Then the completion
of $X$ at $v$ is a Mumford curve $X_\Gamma$. In this case, the Euler factor (8.3.38)
takes the following form:

$$L_v(H^1(X_\Gamma), s) = (1 - q^{-s})^{-g}. \qquad (8.3.39)$$
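The passage from the determinant description to the closed form can be spelled out in one line. The computation below is a sketch resting on two assumptions that the text does not state explicitly: in the split degenerate case the inertia invariants form a $g$-dimensional space, and the geometric Frobenius acts on them as the identity. It also uses $N(v) = q$.

```latex
\begin{aligned}
L_v(H^1(X_\Gamma), s)
  &= \det\left(1 - Fr_v^*\, N(v)^{-s} \,\big|\, H^1(\bar X, \mathbb{Q}_\ell)^{I_v}\right)^{-1} \\
  &= \det\left((1 - q^{-s})\,\mathrm{Id}_g\right)^{-1}
   = (1 - q^{-s})^{-g}.
\end{aligned}
```

For $g = 1$ this is the familiar Euler factor $(1 - q^{-s})^{-1}$ of an elliptic curve with split multiplicative reduction.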


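This is computed by the zeta regularized determinant

$$\det{}_{\infty,\pi(\mathcal{V}),iD}(s) = L_v(H^1(X_\Gamma), s)^{-1}, \qquad (8.3.40)$$

where

$$\det{}_{\infty,a,iD}(s) := \exp\left(-\zeta'_{a,iD,+}(s,0)\right)\exp\left(-\zeta'_{a,iD,-}(s,0)\right), \qquad (8.3.41)$$

for

$$\begin{aligned}
\zeta_{a,iD,+}(s,z) &:= \sum_{\lambda \in \mathrm{Spec}(iD)\cap i[0,\infty)} \mathrm{Tr}(a\,\Pi_\lambda)\,(s+\lambda)^{-z}, \\
\zeta_{a,iD,-}(s,z) &:= \sum_{\lambda \in \mathrm{Spec}(iD)\cap i(-\infty,0)} \mathrm{Tr}(a\,\Pi_\lambda)\,(s+\lambda)^{-z}.
\end{aligned} \qquad (8.3.42)$$

The element $a = \pi(\mathcal{V})$ is the projection onto a linear subspace $\mathcal{V}$ of $\mathcal{H}$, which
is obtained via embeddings of the cohomology of the dual graph $\Delta_\Gamma/\Gamma$ into
the space of cochains of the dynamical cohomology.

The projection $\pi(\mathcal{V})$ acts on the range of the spectral projections $\Pi_n$ of $D$
as elements $Q_n$ in the AF algebra core of the $C^*$-algebra $C^*(\Delta_\Gamma/\Gamma)$.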


8.4 Reduction mod $\infty$


In this section, based on Sections 7 and 8 in [CM], the "reduction mod infinity"
is described in terms of the homotopy quotient associated to the noncommutative
space $\mathcal{O}_A$ of the previous §8.3, and the $\mu$-map of Baum-Connes.
The geometric model of the dual graph can also be described as a homotopy
quotient.


8.4.1 Homotopy quotients and “reduction mod infinity” 


In the previous sections we have described the (noncommutative) geometry
of the fiber at arithmetic infinity of an arithmetic surface in terms of its dual
graph, which we obtained from two quotient spaces: the spaces

$$\Lambda_\Gamma/\Gamma \quad \text{and} \quad \Lambda_\Gamma \times_\Gamma \Lambda_\Gamma \simeq \mathcal{S}/\mathbb{Z}, \qquad (8.4.1)$$

with $\mathbb{Z}$ acting via the invertible shift $T$, which we can think of as the sets of
vertices and edges of the dual graph, cf. §4.2 of [CM]. Their noncommutative
geometry was analyzed in terms of Connes' theory of spectral triples.


Another fundamental construction in noncommutative geometry (cf. [Co83]) 
is that of homotopy quotients. These are commutative spaces, which provide, 
up to homotopy, geometric models for the corresponding noncommutative 
spaces. The noncommutative spaces themselves, as it can be shown in our 
case, appear as quotient spaces of foliations on the homotopy quotients with 
contractible leaves. 


The crucial point in our setting is that the homotopy quotient for the
noncommutative space $\mathcal{S}/\mathbb{Z}$ is precisely the mapping torus which gives the
geometric model of the dual graph,

$$\mathcal{S}_T = \mathcal{S} \times_{\mathbb{Z}} \mathbb{R}, \qquad (8.4.2)$$

where the noncommutative space $\mathcal{S}/\mathbb{Z}$ can be identified with the quotient
space of the natural foliation on (8.4.2) whose generic leaf is contractible (a
copy of $\mathbb{R}$). On the other hand, the case of the noncommutative space $\Lambda_\Gamma/\Gamma$ is
also extremely interesting. In fact, in this case the homotopy quotient appears
very naturally and it describes what is referred to as the "reduction mod $\infty$" in
[Man91].


Let us recall briefly how the reduction map works in the non-Archimedean 
setting of Mumford curves (cf. [Man91], [Mum72]). 

Let $K$ be a finite extension of $\mathbb{Q}_p$ and let $\mathcal{O}_K$ be its ring of integers. The
correct analog for the Archimedean case is obtained by "passing to a limit",
replacing $K$ with its Tate closure in $\mathbb{C}_p$ (cf. [Man91] §3.1); however, for our
purposes here it is sufficient to illustrate the case of a finite extension.

The role of the hyperbolic space $\mathbb{H}^3$ in the non-Archimedean case is played
by the Bruhat-Tits tree $T_{BT}$ with vertices

$$T_{BT}^0 = \{\mathcal{O}_K\text{-lattices of rank 2 in a 2-dim } K\text{-space}\}/K^*.$$

Vertices in $T_{BT}$ have valence $|\mathbb{P}^1(\mathcal{O}_K/\mathfrak{m})|$, where $\mathfrak{m}$ is the maximal ideal. Each
edge in $T_{BT}$ has length $\log|\mathcal{O}_K/\mathfrak{m}|$. The set of ends of $T_{BT}$ is identified with
$X(K) = \mathbb{P}^1(K)$. This is the analog of the conformal boundary $\mathbb{P}^1(\mathbb{C})$ of $\mathbb{H}^3$.
Geodesics correspond to doubly infinite paths in $T_{BT}$ without backtracking.
Fix a vertex $v_0$ on $T_{BT}$. This corresponds to the closed fiber $X_{\mathcal{O}} \otimes (\mathcal{O}/\mathfrak{m})$ for
the chosen $\mathcal{O}$-structure $X_{\mathcal{O}}$. Each $x \in \mathbb{P}^1(K)$ determines a unique choice of a
subgraph $e(v_0, x)$ in $T_{BT}$ with vertices $(v_0, v_1, v_2, \ldots)$ along the half infinite
path without backtracking which has end $x$. The subgraphs $e(v_0, x)_k$, with
vertices $(v_0, v_1, \ldots, v_k)$, correspond to the reduction mod $\mathfrak{m}^k$, namely

$$\{e(v_0, x)_k : x \in X(K)\} \simeq X_{\mathcal{O}}(\mathcal{O}/\mathfrak{m}^k).$$

Thus the finite graphs $e(v_0, x)_k$ represent $\mathcal{O}/\mathfrak{m}^k$ points, and the infinite graph
$e(v_0, x)$ represents the reduction of $x$.

A Schottky group $\Gamma$, in this non-Archimedean setting, is a purely
loxodromic free discrete subgroup of $PSL(2, K)$ in $g$ generators. The doubly
infinite paths in $T_{BT}$ with ends at the pairs of fixed points $z^\pm(\gamma)$ of the elements
$\gamma \in \Gamma$ produce a copy of the combinatorial tree $\mathcal{T}$ of the group $\Gamma$ in $T_{BT}$. This
is the analog of regarding $\mathbb{H}^3$ as the union of the translates of a fundamental
domain for the action of the Schottky group, which can be thought of as a
'tubular neighborhood' of a copy of the Cayley graph $\mathcal{T}$ of $\Gamma$ embedded in
$\mathbb{H}^3$. The ends of the tree $\mathcal{T} \subset T_{BT}$ constitute the limit set $\Lambda_\Gamma \subset \mathbb{P}^1(K)$. The
complement $\Omega_\Gamma = \mathbb{P}^1(K) \setminus \Lambda_\Gamma$ gives the uniformization of the Mumford curve
$X(K) \simeq \Omega_\Gamma/\Gamma$. In turn, $X(K)$ can be identified with the ends of the quotient
graph $T_{BT}/\Gamma$, just as in the Archimedean case the Riemann surface is the
conformal boundary at infinity of the handlebody $\mathfrak{X}_\Gamma$.

The reduction map is then obtained by considering the half infinite paths
$e(v,x)$ in $T_{BT}/\Gamma$ that start at a vertex $v$ of the finite graph $\mathcal{T}/\Gamma$ and whose
end $x$ is a point of $X(K)$, while the finite graphs $e(v,x)_k$ provide the $\mathcal{O}/\mathfrak{m}^k$
points.


This suggests that the correct analog of the reduction map in the
Archimedean case is obtained by considering geodesics in $\mathbb{H}^3$ with an end on $\Omega_\Gamma$
and the other on $\Lambda_\Gamma$, as described in [Man91]. Arguing as in Lemma 4.9 of
[CM], we see that the set of such geodesics can be identified with the quotient
$\Omega_\Gamma \times_\Gamma \Lambda_\Gamma$. The analog of the finite graphs $e(v, x)_k$ that define the reductions
modulo $\mathfrak{m}^k$ is then given by the quotient $\mathbb{H}^3 \times_\Gamma \Lambda_\Gamma$.


Notice then that the quotient space

$$\Lambda_\Gamma \times_\Gamma \mathbb{H}^3 = \Lambda_\Gamma \times_\Gamma E\Gamma \qquad (8.4.3)$$

is precisely the homotopy quotient of $\Lambda_\Gamma$ with respect to the action of $\Gamma$, with
$E\Gamma = \mathbb{H}^3$ and the classifying space $B\Gamma = \mathbb{H}^3/\Gamma = \mathfrak{X}_\Gamma$ (cf. [Co83]). In this
case also we find that the noncommutative space $\Lambda_\Gamma/\Gamma$ is the quotient space
of a foliation on the homotopy quotient (8.4.3) with contractible leaves $\mathbb{H}^3$.
8.4.2 Baum-Connes map


The relation between the noncommutative spaces (8.4.1) and the homotopy
quotients (8.4.2) and (8.4.3) is an instance of a very general and powerful
construction, namely the $\mu$-map (cf. [BaCo], [Co83]). In particular, in the
case of the noncommutative space $C(\mathcal{S}) \rtimes_T \mathbb{Z}$, the $\mu$-map

$$\mu : K^{*+1}(\mathcal{S}_T) \simeq H^{*+1}(\mathcal{S}_T,\mathbb{Z}) \to K_*(C(\mathcal{S}) \rtimes_T \mathbb{Z}) \qquad (8.4.4)$$

is the Thom isomorphism that gives the identification of (8.3.11), (8.3.12)
and recovers the Pimsner-Voiculescu exact sequence (8.3.8) as in [Co81]. As
explained in [CM], the map $\mu$ of (8.4.4) assigns to a K-theory class
$\mathcal{E} \in K^{*+1}(\mathcal{S} \times_{\mathbb{Z}} \mathbb{R})$ the index of the longitudinal Dirac operator $\partial\!\!\!/_{\mathcal{E}}$ with coefficients
$\mathcal{E}$. This index is an element of the K-theory of the crossed product algebra
$C(\mathcal{S}) \rtimes_T \mathbb{Z}$ and the $\mu$-map is an isomorphism. Similarly, in the case of the
noncommutative space $C(\Lambda_\Gamma) \rtimes \Gamma$, where we have a foliation on the total
space with leaves $\mathbb{H}^3$, the $\mu$-map

$$\mu : K^{*+1}(\Lambda_\Gamma \times_\Gamma \mathbb{H}^3) \to K_*(C(\Lambda_\Gamma) \rtimes \Gamma) \qquad (8.4.5)$$

is again given by the index of the longitudinal Dirac operator $\partial\!\!\!/_{\mathcal{E}}$ with
coefficients $\mathcal{E} \in K^{*+1}(\Lambda_\Gamma \times_\Gamma \mathbb{H}^3)$. In this case the map is an isomorphism
because the Baum-Connes conjecture with coefficients holds for the case of
$G = SO_0(3,1)$, with $\mathbb{H}^3 = G/K$ and $\Gamma \subset G$ the Schottky group, cf. [Kas].


In particular, analyzing the noncommutative space $C(\Lambda_\Gamma) \rtimes \Gamma$ from the
point of view of the theory of spectral triples provides cycles to pair with
K-theory classes constructed geometrically via the $\mu$-map.


To complete the analogy with the reduction map in the case of Mumford
curves, one should also consider the half infinite paths $e(v,x)$ corresponding
to the geodesics in $\mathfrak{X}_\Gamma$ parameterized by $\Lambda_\Gamma \times_\Gamma \Omega_\Gamma$, in addition to the finite
graphs $e(v, x)_k$ that correspond to the homotopy quotient (8.4.2). This means
that the space that completely describes the "reduction modulo infinity" is a
compactification of the homotopy quotient

$$\Lambda_\Gamma \times_\Gamma (\mathbb{H}^3 \cup \Omega_\Gamma), \qquad (8.4.6)$$

where $\overline{E\Gamma} = \mathbb{H}^3 \cup \Omega_\Gamma$ corresponds to the compactification of the classifying
space $B\Gamma = \mathbb{H}^3/\Gamma = \mathfrak{X}_\Gamma$ to $\overline{B\Gamma} = (\mathbb{H}^3 \cup \Omega_\Gamma)/\Gamma = \mathfrak{X}_\Gamma \cup X_{/\mathbb{C}}$, obtained by
adding the conformal boundary at infinity of the hyperbolic handlebody.
Height 
Néron-Tate —, 229 
Néron-Tate —, 224 
Hensel’s Lemma, 138 
Hensel’s lemma, 134, 187 
Hermite Theorem, 248, 249 
Hermite’s Theorem, 125, 195, 250 
Hermitian metrization, 211, 399 
Hermitian vector bundle, 399 
Hilbert symbol, 139, 164 
Hilbert’s seventh problem, 61 
Hilbert’s tenth problem, 96, 113 
Hilbert’s Theorem 90, 180 
Hilbert’s twelfth problem, 169, 240 
Hodge decomposition, 293 
Hodge structure, 273 
Hurwitz 
formula , 248 
Hurwitz genus formula, 196 


Ideal, 13, 126 

— class group, 128 

Fractional —, 37, 127 

Maximal -, 127 

Prime —, 126 

Principal —, 126 

Principle, 13 
Ideal class group, 37 

Finiteness of —, 152 
Idele 

— class group, 154 
Idele group, 149 
Inertia group, 158 
Inertial degree, 142, 157 
Inflation-restriction sequence, 181, 378 
Integer, 116 

Integer 

p-adic —, 136 

Integral extension of a ring, 195 
integral linear programming, 24 
Integrally closed ring, 127 
Intersection number, 211, 399 
Intractable problem, 24 
Invariant of local ring 

kernel of augmentation, 370, 382 


tangent space, 370, 382 
Invariants of a local Noetherian 
O-algebra, 370, 382 
Isogeny of Abelian varieties, 226 
Iwasawa module, see Tate module of a 
number field 


Jacobi sum, 35, 73 
Jacobi symbol, 16 
Jacobi’s formula, 29 
Jacobian of a curve, 228 


Kloosterman sum 

distribution of —-s, 271 
Kloostermann sum, 35 
Knapsack problem, 24 
Kronecker’s “Jugendtraum”, 169 
Kronecker-Weber Theorem, 119 
Kronecker-Weber theorem, 155, 168, 

240 

Kummer extension, 175 
Künneth formula, 293 


l-adic realization of a motive, 293 
L-function 
— of a curve, 291 
— of a modular form, 296 
— of a motive, 294 
— of a rational representation, 272 
— of an automorphic representation, 
333 
Artin —, 276 
automorphic —, 336 
generalized Dirichlet —, 279 
Hecke-Weil —, 290 
l-adic —, 246 
standard —, 339 
L-group, 335 
L-homomorphism, 338 
L-packet, 339 
Lagrange resolution, 180 
Lamé’s theorem, 12 
Langlands functoriality principle, 338 
Langlands program, 5, 160, 169, 276, 
339 
Langlands’ conjecture, 339 
Lattice 
—in a real vector space, 120 
Least common multiple, 12 


Lefschetz fixed point formula, 266 
Lefschetz module, 437 
Legendre symbol, 15 
Level 

— of a quadratic form, 303 

— structure, 231 
level, 343 
limit set of the action, 422, 443 
Linear algebraic group, 231 
Local Artin map, 163-168 
Local Artin symbol 

Cohomological definition of —, 182 
Local degree, 146 
Local invariant, 186 
Localization, 133 
Logarithm 

Discrete, 14, 70 

Integral —, 19 

Logarithm 

p-adic —, 137 

Logarithmic map 

— of a number field, 122 
Lubin—Tate formal group, 168 


Möbius function, 19 
mapping torus, 443 
Mass of a quadratic form, 232 
Matiyasevich’s Theorem, 109 
Matiyasevich’s theorem, 96, 203 
Maximal ideal, 127 
Mazur module, 246 
Mazur’s theorem on the rational torsion 
of an elliptic curve, 225 
Measure 
— of Fundamental domain for a 
number field in its adele ring, 151 
— of fundamental domain for the idele 
classes, 153 
Haar —, 143 
Multiplicative invariant, 71 
of irrationality, 56 
self-dual — on the additive group of a 
local field, 282 
Tamagawa —, 234 
Mellin transform, 286 
Mersenne prime, 68 
Mersenne primes, 11 
Method 
— of Baker, 58 
Hardy-Littlewood —, 38 
Hardy-Littlewood circle, 33 
Hardy-Littlewood circle —, 38 
of Apéry, 56, 58-60 
Ruler-and-compass, 4 
Secant-tangent —, 39 
Vinogradov’s — of exponential sums, 

34 

minimal conductor, 344 

Minkowski Theorem, 248 

Minkowski’s lemma on a convex body., 

123 

Minkowski’s Theorem, 125, 195 

Minkowski-Hasse principle, 23, 38, 187, 

189, 204, 231 

— for a quadratic form, 26 

Minkowski-Hasse theorem, 134, 189 

Minkowski-Siegel formula for the mass 

of a quadratic form, 233 

modular curve, 318, 343 

Modular form, 4, 297 

Siegel —, 337 

modular form, 343 

Modular function, 299 

Modular group, 217 

Module of an idele, 150 

moduli spaces, 257 

Mordell’s conjecture, 247 

Mordell’s theorem, 41 

Mordell-Weil theorem 

— for an Abelian variety, 229 

— for an elliptic curve, 221 

Mori’s theorem, 211 

Morita category, 410 

Morita equivalence, 410 


Motive, 292 
— of pure weight, 294 
false —, 292 
false effective —, 292 
true —, 293 


Motivic Galois group, 293 


Néron differential, 257 

Néron differential, 322 

Néron-Tate height, see Height 
Newton-Raphson algorithm, 134, 138 
Nilpotent element of a ring, 193 
Noetherian ring, 127 

Non-singular model (of a field), 213 
Noncommutative space, 409, 443, 448 
Norm 
— of a closed point of an arithmetic 
scheme, 261 
— of an algebraic number, 116, 120 
— of an element of the Weil group, 289 
— of an ideal, 127 
of an algebraic number, 28, 36 
of an ideal, 37 
Reduced —, 186 
Relative — of a non—Archimedean 
place, 160 
Norm residue homomorphism, 163 
Norm residue symbol, 139 
Normal form 
of a cubic curve modulo a prime, 47 


Obstruction 
Brauer-Manin —, 205 
descent —, 205 
One-way function, 64 
Order, 132 
Orthogonal group, 233 
Ostrowski’s theorem, 133, 145 


p-adic 

p-adic 

— digit, 136 
p-adic 
— Logarithm, 137 

p-adic number, 134 
Parameterization 

rational —, 40 
Partition, 31 
Partition function, 31 

Application of circle method to —, 33 
PEL-family, 231 
Pell’s equation, 28, 50, 54, 98, 121 

Minimal solution to —, 28 
Perfect number, 11 
Period matrix of a polarized Abelian 

variety, 230 
Petersson inner product, 35, 302 
Petersson-Ramanujan conjecture, see 
Deligne’s estimate 

Picard group., 200 
Pimsner—Voiculescu exact sequence, 444 
Place of a field, 145 
Poincaré divisor, see divisor 


Point 
— of finite order on an elliptic curve, 
218 
geometric, 197 
regular —, 200 
ring-valued — of a scheme, 197 
singular —, 200 
Points of finite order 
— of Tate’s curve, 219, 299 
Poisson summation formula, 283 
Polarization 
— of an Abelian variety, 227 
algebraic — of an Abelian variety, 228 
canonical principal — of a Jacobian, 
229 
principal — of an Abelian variety, 227 
Polarized Abelian variety, see Abelian 
variety 
Pontryagin duality, 147 
Power residue symbol, 177, 184 
Prime 
Commercial —, 67 
Large gaps between —s, 66 
prime element, 126 
Prime ideal, 126, 193 
Prime number theorem, 18, 66 
Primitive element theorem, 117 
Primitive form, 333 
Primitive root, 14 
Primitively enumerable set, 109 
Principal ideal, 126 
principal ideal 
group of —s, 128 
Principal polarization, see Polarization 
Principle 
finiteness for the height, 258 
Problem 
Waring’s —, 35 
Product 
Restricted topological —, 147 
Product formula for absolute values, 
141, 146 
Product formula for Hilbert symbols, 
140 
Product formula for local symbols, 188 
Projective space over a ring, 197 
Pseudo-Abelian category, see Caroubien 
category 
Pseudoprime, 11 


Eulerian —, 16 
Strict —, 67 
Pythagorean triples, 24 


Quadratic character, 130 

Quadratic form, 231 
Ambiguous binary —, 85 
Equivalence of —s, 35 
Primitive binary —, 36 
Proper equivalence of —s, 36 
Reduced binary —, 37 

Quadratic reciprocity law, 140, 164 

Quadric, 25 

Quasicharacter, 71, 280 
— of a topological group, 289 

Quaternion algebra, 244 


Ramanujan’s congruence, 317 
Ramanujan’s function, 216, 242, 299, 
304 
Ramanujan-Petersson conjecture, 334 
Ramification group, 329 
Ramification index, 142, 157 
Rank 
Rank 
— of a cubic curve over Q, 41 
— of an Abelian variety, 229 
— of an elliptic curve 
upper bound on -, 223 
Rational parameterization, 57 
Rational parametrization 
— of the circle, 24 
Rational surface, 205 
Real multiplication, 169 
Real part of a quasicharacter, 280 
Reciprocity Law 
Quadratic, 71 
Reciprocity law 
Quadratic, 4, 15, 345 
recursive function, 97, 104 
Reduction 
good and bad, 212, 399 
Reduction of a scheme of arithmetic 
type modulo a prime ideal, 197 
Regular function, 193 
Regular point, 200 
regular prime, 342 
Regulator 
— of a number field, 123 
— of an elliptic curve, 225 
Regulator of a number field, 153 
Reidemeister torsion, 437 
Relative invariant, 381, 382 
relative residue field degree, 160 
Representation 
— by a quadratic form, 29 
— of the Weil group, 289 
— of zero by a quadratic form, 25 
automorphic, 334 
automorphic —, 296, 335 
Binary —, 11 
Cyclotomic —, 166 
Galois —, 317-331 
— in positive characteristic, 243 
— of an Abelian variety, 226 
— of an elliptic curve, 218 
—on Etale cohomology, 226 
—on the Tate module of an Abelian 
variety, 239 
—on the Tate module of an elliptic 
curve, 238 
Abelian J-adic —, 242 
Abelian —, 275 
cyclotomic —, 219 
entire —, 272 
irreducible two dimensional complex 
—, 327 
modular —, 330, 354, 359 
rational —, 272 
two-dimensional — over a finite field, 
330 
unramified —, 272 
Galois-type — of the Weil group, 290 
number of —s by a quadratic form, 
303, 337 
spinor —, 337 
standard —, 336 
Restricted generality quantor, 113 
Restricted topological product, 147 
Riemann hypothesis, 21, 102 
— for an algebraic variety over a finite 
field, 265 
generalized, 290 
Generalized —, 67 
generalized —, 276 
Riemann periodicity relations, 228 
Riemann surface, 195, 421, 423 
Riemann-Roch theorem for curves, 213 
Riemannian form, 227, see form 
Ring 
— of arithmetical type, 194 
Ring 
— of p-adic integers, 136 
complete intersection — 
local —, 370 
Gorenstein — 
local —, 371 
Integrally closed —, 127 
Krull —, 133 
Noetherian —, 127 
of integers in a number field, 36 
of residue classes, 13 
Valuation —, 133 
Ring of adeles, 146 
Ring of integers of a number field, 116, 
127 
Root data, 335 
Rosatti involution, 230, 250 
Ruler and compass method, 172 


S-unit, 152 
Sato—Tate Conjecture, 49, 271, 328, 334 
Scew homomorphism, 179 
Scheme, 196 

— of arithmetic type, 196 

— of geometric type, 196 

— over Finite Field, 263 

affine —, 196 

irreducible —, 196 

Scheme 

K- -, 196 

projective —, 198 
Schottky group, 419, 421, 422 
Schottky uniformization, 416, 419, 423 
Schur multiplier, 180 
Selberg trace formula, 339 
Selmer group 

— of an Abelian variety, 229 

Generalized — —of a Galois module, 

375 

of a Galois module, 375 

of an elliptic curve, 224, 375 
Semi-simplification of a representation, 

275 

semistable, 345 
Separable field extension, 117 


Serre’s conjecture on Galois represen- 
tations over finite fields, 330, 
354 
Shafarevich Conjecture 
for elliptic curves, 249 
Shafarevich conjecture, 247, 248 
on bad reduction, 259 
on finiteness, 251 
Shafarevich Theorem, 249 
Shafarevich—Tate group, 205 
of an elliptic curve, 224 
Shafarevich-Tate group 
— of an Abelian variety, 229 
Sheaf 
ample —, 201, 213 
canonical —, 201 
invertible —, 200 
metrized invertible —, 209 
very ample —, 201 
shift operator, 443 
Shimura variety, see Variety 
Shimura-Taniyama-Weil Conjecture, 320, 330, 341
Siegel modular group, 231 
Siegel Theorem, 249 
Siegel upper half space, 230 
Siegel’s formula 
— for representation numbers for quadratic forms, 232
Siegel’s Theorem, 247, 250 
Sieve 
Eratosthenes —, 10 
Sign of an elliptic curve, 322 
Simple central algebra, 184 
Simple function, 103 
Singular point, 200 
Singular series, 34 
Skolem-Noether theorem, 186
Smale space, 443 
Smoothness of a natural number, 86, 90 
Solution
L-valued —, 191
Solvable set, 112 
Sophie Germain prime, 68 
Specialization, 194 
spectral triples, 421 
Spectrum of a ring, 193 
Spiegelungssatz
Leopoldt’s —, 128
Split
a place —s completely, 160
Prime which —s completely in an extension, 129
Stark’s conjectures, 171
Stereographic projection, 25 
Structural morphism, 196 
Surface
arithmetical —, 211, 399
Del Pezzo —, 203
suspension flow, 443 
Symplectic group, 226 


Tamagawa number, 234 
tangle of bounded geodesics, 421 
Tannakian category, 293 
Tate curve, 219, 299 
Tate field, 143 
Tate module 
— of a number field, 244 
— of an Abelian variety, 239 
— of an elliptic curve, 238 
— of the multiplicative group, 238 
Tate’s conjecture, 248 
for Abelian varieties, 252 
for elliptic curves, 249 
on isogenies, 252 
Teichmüller representative, 136, 142
Tensor product of fields 
Theorem on —, 119
Theorem 
— of Coates-Wiles, 323 
— of Deligne-Serre, 119, 327 
finiteness for isogenies, 252 
Franchis de, 249 
on semisimplicity, 253 
Dirichlet, 250 
finiteness for forms of an Abelian variety, 251
Hermite, 248-250 
Kronecker-Weber —, 119 
Main — of Galois theory, 117 
Minkowski, 248 
Shafarevich, 249 
Siegel, 247, 249, 250 
Torelli, 250 
Weil, 252 
Theta function, 30
cubic analogue of the —, 335
Theta series, 303 
Thom isomorphism, 444 
Thue equation, 58 
Thue-Siegel-Roth theorem, 57 
Torelli’s Theorem, 250 
Torelli’s theorem, 229 
Torsion
— of a cubic curve over Q, 41
Torsor, 206
totally ramified extension, 143
Trace, 36
— of Frobenius endomorphisms, 242
— of an algebraic number, 116, 120
Distribution of —s of Frobenius elements, 271
Reduced —, 186
Transcendental number, 61
Transfer homomorphism, 162
Transform
Mellin —, 4
Trap-door function, 64 


Unit in a number field, 121 
Unramified 
— integral extension, 195 
— place of a field, 158 
Unramified extension, 143 
Upper half plane, 296 


Valuation, 132 
— ring, 133 
Variety, 199
Abelian, see Abelian variety
absolutely (or geometrically) irreducible —, 199
Kuga-Sato —, 318
Shimura —, 231
variety
— Fano, 203
— general type, 203
— intermediate type, 203
Verlagerung, see Transfer homomorphism


Waring’s problem, 35 
Washington’s conjecture, 246 
Weber-Fueter theorem, 240
Weierstrass normal form, 38, 214
Weierstrass product expansion, 286
Weil conjectures on the zeta function of a scheme over a finite field, 265
Weil group, 288
Weil pairing
— on an Abelian variety, 226
— on an elliptic curve, 219
Weil’s inverse theorem, 312
analogue of — for GL(3), 336
Weil’s Theorem
for Abelian varieties over finite fields, 252
Wild invariant of a representation, 329


Zariski topology, 193
Zeta function
— of a hypersurface, 268
— of an arithmetic scheme, 261
Dedekind —, 276
Hasse-Weil —, 437
Riemann —, 19, 20, 261
zeta function
— of a modular curve, 319


